R provides the cut() function, which converts a numeric vector into a factor by binning its values into intervals. This is useful for categorizing data. This tutorial shows how to use the **cut() function in R**.

**What is the cut() function in R?**

The cut() function converts a numeric vector to a factor by dividing the range of the vector into intervals and coding each value according to the interval it falls in.

**Syntax:**

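`cut(x, breaks, labels = NULL, include.lowest = FALSE, right = TRUE, dig.lab = 3, ordered_result = FALSE, ...)`

**Parameters:**

- **x**: a numeric vector to cut.
- **breaks**: either a single number (2 or greater) giving the number of intervals into which x is to be cut, or a numeric vector of two or more unique cut points.
- **labels**: labels for the levels of the result. The default is NULL, which builds labels in interval notation.
- **include.lowest**: whether a value equal to the lowest ‘breaks’ value (or the highest, for ‘right = FALSE’) should be included. The default is FALSE.
- **right**: whether the intervals should be closed on the right (and open on the left). The default is TRUE.
- **dig.lab**: the number of digits used in formatting the break numbers when labels are not given. The default is 3.
- **ordered_result**: whether the result should be an ordered factor. The default is FALSE.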
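The ‘dig.lab’ and ‘ordered_result’ parameters are not used in the examples that follow, so here is a minimal sketch of what they do. The break values and the variable names (`cuts`, `default_labels`, `more_digits`, `sizes`) are made up for illustration.

```r
# Illustrative cut points; 1.23456 has more digits than the default label shows
cuts <- c(0, 1.23456, 2.5)

# dig.lab controls how many significant digits appear in the generated labels
default_labels <- levels(cut(c(0.5, 2), breaks = cuts))
more_digits <- levels(cut(c(0.5, 2), breaks = cuts, dig.lab = 5))
default_labels  # "(0,1.23]"   "(1.23,2.5]"
more_digits     # "(0,1.2346]" "(1.2346,2.5]"

# ordered_result = TRUE returns an ordered factor, so levels can be compared
sizes <- cut(c(0.5, 2), breaks = cuts, ordered_result = TRUE)
is.ordered(sizes)    # TRUE
sizes[1] < sizes[2]  # TRUE
```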

**How to use the cut() function in R?**

We create a vector x containing the integers from 1 to 5. Here are some examples of how to use the cut() function to categorize the data in x.

In the first example, we set the ‘breaks’ parameter to 2 (a single number), so the cut() function cuts the vector x into 2 intervals of equal width.

**Code:**

```r
# Create a vector
x <- 1:5

# Cut the vector x into 2 equal intervals
cut(x, breaks = 2)
```

**Output:**

```
[1] (0.996,3] (0.996,3] (0.996,3] (3,5] (3,5]
Levels: (0.996,3] (3,5]
```
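The 0.996 in the first level comes from how cut() handles a single number for ‘breaks’: according to the R documentation, the range of the data is split into pieces of equal length, and the outer limits are then moved outward by 0.1% of the range so that the extreme values fall inside an interval. A minimal sketch that reproduces those edges by hand (the names `rng`, `width`, and `edges` are illustrative only):

```r
x <- 1:5

# Range of the data and its width
rng <- range(x)
width <- diff(rng)

# Three equally spaced edges give two equal-width intervals
edges <- seq(rng[1], rng[2], length.out = 3)

# Move the outer edges outward by 0.1% of the range
edges[1] <- rng[1] - width / 1000
edges[3] <- rng[2] + width / 1000
edges  # 0.996 3.000 5.004

# Passing these edges explicitly gives the same result as breaks = 2
by_hand <- cut(x, breaks = edges)
by_count <- cut(x, breaks = 2)
```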

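In the following example, we set the ‘breaks’ parameter to a numeric vector of 3 cut points, so the cut() function divides the values of x into 2 intervals: (1,2] and (2,5]. The first level corresponds to the leftmost interval. Because the intervals are open on the left by default, the value 1 falls outside every interval and becomes NA.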

**Code:**

```r
# Create a vector
x <- 1:5

# Cut the vector x at explicit cut points
cut(x, breaks = c(1, 2, 5))
```

**Output:**

```
[1] <NA> (1,2] (2,5] (2,5] (2,5]
Levels: (1,2] (2,5]
```
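Values that fall outside all the intervals become NA, as the value 1 does above. One common way to avoid this, shown here only as a sketch, is to use -Inf and Inf as the outer cut points so that every finite value lands in some interval. The variable name `binned` is illustrative.

```r
x <- 1:5

# Open-ended outer cut points: everything up to 2, and everything above 2
binned <- cut(x, breaks = c(-Inf, 2, Inf))

# No value is left unclassified
anyNA(binned)  # FALSE
table(binned)
```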

By default, the ‘right’ parameter is TRUE, meaning that intervals are open on the left and closed on the right. If the ‘right’ parameter is set to FALSE, the intervals are closed on the left and open on the right. In the example below, the value 5 then falls outside every interval and becomes NA.

**Code:**

```r
# Create a vector
x <- 1:5

# Make the intervals closed on the left
cut(x, breaks = c(1, 2, 5), right = FALSE)
```

**Output:**

```
[1] [1,2) [2,5) [2,5) [2,5) <NA>
Levels: [1,2) [2,5)
```
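To see the effect of ‘right’ on values that sit exactly on a cut point, the two settings can be placed side by side. This is a small sketch; the data frame and its column names (`comparison`, `right_closed`, `left_closed`) are made up for illustration.

```r
x <- 1:5
cut_points <- c(1, 2, 5)

# Same cut points, both settings of 'right'
comparison <- data.frame(
  x = x,
  right_closed = cut(x, breaks = cut_points),
  left_closed = cut(x, breaks = cut_points, right = FALSE)
)
comparison

# The boundary value 2 belongs to (1,2] in one case and to [2,5) in the other
as.character(comparison$right_closed[2])  # "(1,2]"
as.character(comparison$left_closed[2])   # "[2,5)"
```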

By default, the levels are labelled in interval notation, such as (a,b]. In this example, we set custom labels for the levels of the output. Note that the number of labels must equal the number of intervals the cut() function returns.

**Code:**

```r
# Create a vector
x <- 1:5

# Set labels for the levels
cut(x, breaks = c(1, 2, 5), labels = c("Group1", "Group2"))
```

**Output:**

```
[1] <NA> Group1 Group2 Group2 Group2
Levels: Group1 Group2
```
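Two related behaviours of ‘labels’ are worth a quick sketch. Setting ‘labels = FALSE’ makes cut() return plain integer codes instead of a factor, and supplying the wrong number of labels raises an error. The variable names `codes` and `mismatch` are illustrative.

```r
x <- 1:5

# labels = FALSE returns integer interval codes rather than a factor
codes <- cut(x, breaks = c(1, 2, 5), labels = FALSE)
codes  # NA 1 2 2 2

# Three labels for two intervals: cut() stops with an error, caught here
mismatch <- tryCatch(
  cut(x, breaks = c(1, 2, 5), labels = c("A", "B", "C")),
  error = function(e) e
)
inherits(mismatch, "error")  # TRUE
```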

If we set the ‘include.lowest’ parameter to TRUE, the cut() function also includes a value equal to the lowest ‘breaks’ value (or the highest ‘breaks’ value if ‘right = FALSE’). Here, the value 1 is assigned to Group1 instead of becoming NA.

**Code:**

```r
# Create a vector
x <- 1:5

# Include the lowest 'breaks' value in the first interval
cut(
  x,
  breaks = c(1, 2, 5),
  labels = c("Group1", "Group2"),
  include.lowest = TRUE
)
```

**Output:**

```
[1] Group1 Group1 Group2 Group2 Group2
Levels: Group1 Group2
```
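The same parameter covers the other end when ‘right = FALSE’. The sketch below reuses the cut points from above and leaves the labels at their default so that the change in the last interval is visible; `closed_top` is an illustrative name.

```r
x <- 1:5

# With right = FALSE, include.lowest = TRUE closes the highest interval instead
closed_top <- cut(
  x,
  breaks = c(1, 2, 5),
  right = FALSE,
  include.lowest = TRUE
)

levels(closed_top)  # "[1,2)" "[2,5]"
anyNA(closed_top)   # FALSE: the value 5 is now included
```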

**Complete code**

We use the cut() function to classify some students’ test scores.

**Example:**

```r
# Student names and their test scores
Name <- c(
  "Frank", "Charles", "Johnny", "Orlando", "Bruce",
  "Lynda", "Alice", "Robin", "Charles", "Hanna"
)
Scores <- c(75, 40, 39, 5, 67, 90, 55, 78, 0, 86)

# Classify each score into a letter grade
Grade <- cut(
  Scores,
  breaks = c(0, 39, 44, 49, 54, 59, 64, 69, 79, 100),
  labels = c("F", "E", "D", "C", "C+", "B", "B+", "A", "A+"),
  include.lowest = TRUE
)

# Combine everything into a data frame and print it
gradePoint <- data.frame(Name, Scores, Grade)
gradePoint
```

**Output:**

```
      Name Scores Grade
1    Frank     75     A
2  Charles     40     E
3   Johnny     39     F
4  Orlando      5     F
5    Bruce     67    B+
6    Lynda     90    A+
7    Alice     55    C+
8    Robin     78     A
9  Charles      0     F
10   Hanna     86    A+
```

The breaks divide the scores from 0 to 100 into nine intervals, labelled from “F” to “A+” in ascending order. A test score can be 0, so we set ‘include.lowest’ to TRUE. Otherwise, a score of 0 would fall outside the first interval and become NA.
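Because the result of cut() is a factor, it can be passed straight to table() to count how many students received each grade. This is a small follow-up sketch using the same scores and breaks as above; the name `counts` is illustrative.

```r
Scores <- c(75, 40, 39, 5, 67, 90, 55, 78, 0, 86)

Grade <- cut(
  Scores,
  breaks = c(0, 39, 44, 49, 54, 59, 64, 69, 79, 100),
  labels = c("F", "E", "D", "C", "C+", "B", "B+", "A", "A+"),
  include.lowest = TRUE
)

# Count the students in each grade; unused grades appear with a count of 0
counts <- table(Grade)
counts
```

Grades that no student received, such as “D”, still appear in the table with a count of 0 because they remain levels of the factor.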

**Summary**

This tutorial showed how to use the **cut() function in R**. If you want the output to include the lowest ‘breaks’ value (or the highest with ‘right = FALSE’), you must set ‘include.lowest’ to TRUE. Thank you for reading.
