When working with data in Python, one of the most important skills is indexing and selecting data. It lets you extract specific rows, columns, or values from a dataset efficiently. In data science, this is commonly done with the pandas library, which provides the loc and iloc indexers for this purpose.

Note: loc and iloc belong to pandas, not NumPy. NumPy is mainly used for numerical operations on arrays, while pandas is designed for structured data such as tables.
What is Indexing in Data Analysis?
Indexing means selecting specific parts of a dataset. In real-world data, you rarely use the entire dataset at once. Instead, you extract relevant rows or columns.
For example:
- Selecting a student’s record from a table
- Filtering sales data for a specific month
- Extracting a column like “Salary” or “Age”
This is where loc and iloc become very useful.
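As a rough sketch of the three tasks listed above, the snippet below uses a small DataFrame called `records`. The table, its column names (`Name`, `Month`, `Salary`), and the row labels `s1` to `s3` are all invented here for illustration and do not come from a real dataset.

```python
import pandas as pd

# Hypothetical data, invented for illustration only.
records = pd.DataFrame(
    {
        "Name": ["Amit", "Riya", "John"],
        "Month": ["Jan", "Feb", "Jan"],
        "Salary": [30000, 42000, 35000],
    },
    index=["s1", "s2", "s3"],
)

# Selecting one record (row) by its label.
one_record = records.loc["s2"]

# Filtering rows for a specific month with a boolean condition.
january = records.loc[records["Month"] == "Jan"]

# Extracting a single column.
salaries = records["Salary"]

print(one_record["Name"])
print(list(january.index))
print(salaries.tolist())
```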
Introduction to loc and iloc
1. loc (Label-based indexing)
The loc indexer selects data by label (the names of rows or columns). It is an indexer rather than a function, so it is used with square brackets instead of parentheses.
Syntax:
df.loc[row_label, column_label]
Example:
import pandas as pd

data = {
    'Name': ['Amit', 'Riya', 'John'],
    'Age': [20, 22, 21]
}
df = pd.DataFrame(data, index=['a', 'b', 'c'])
print(df.loc['a'])
Output:
Name    Amit
Age       20
Name: a, dtype: object
Key Points of loc:
- Uses row/column labels
- Includes both start and end labels when slicing
- Supports boolean conditions
Example with condition:
df.loc[df['Age'] > 20]
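The key points above can be checked with a short sketch that rebuilds the same `df` with row labels 'a', 'b', 'c'. The variable names `first_two`, `riya_age`, and `older_names` are chosen here only for illustration.

```python
import pandas as pd

df = pd.DataFrame(
    {"Name": ["Amit", "Riya", "John"], "Age": [20, 22, 21]},
    index=["a", "b", "c"],
)

# Label slice: both the start label 'a' and the end label 'b' are included.
first_two = df.loc["a":"b"]

# Single value: row label 'b', column label 'Age'.
riya_age = df.loc["b", "Age"]

# Boolean condition combined with a column label:
# names of everyone older than 20.
older_names = df.loc[df["Age"] > 20, "Name"]

print(list(first_two.index))
print(riya_age)
print(older_names.tolist())
```

Passing a column label as the second argument keeps the row filter and the column selection in a single loc call.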
2. iloc (Integer-based indexing)
The iloc indexer selects data by integer position (0 for the first row or column, 1 for the second, and so on). Like loc, it is used with square brackets.
Syntax:
df.iloc[row_index, column_index]
Example:
print(df.iloc[0])
Output:
Name    Amit
Age       20
Name: a, dtype: object
Key Points of iloc:
- Uses integer positions (0, 1, 2, …)
- Works like Python list indexing
- Does NOT include the end index in slicing
Example:
print(df.iloc[0:2])
This will return the first two rows only.
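A few more positional selections, sketched on the same three-row `df`. The variable names are illustrative only.

```python
import pandas as pd

df = pd.DataFrame(
    {"Name": ["Amit", "Riya", "John"], "Age": [20, 22, 21]},
    index=["a", "b", "c"],
)

# Positions 0 and 1 only; the end position 2 is excluded.
first_two = df.iloc[0:2]

# Negative positions count from the end, as with Python lists.
last_row = df.iloc[-1]

# Row position 1, column position 0.
cell = df.iloc[1, 0]

# A list of row positions plus one column position.
ages = df.iloc[[0, 2], 1]

print(list(first_two.index))
print(last_row["Name"])
print(cell)
print(ages.tolist())
```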
Difference Between loc and iloc
| Feature | loc | iloc |
|---|---|---|
| Type | Label-based | Integer-based |
| Input | Names/labels | Index numbers |
| Slicing | Inclusive | Exclusive of end |
| Usage | Real-world labeled data | Position-based selection |
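The slicing row of the table is the one that most often causes surprises. A minimal sketch, assuming a DataFrame with the default integer index so that the same slice `0:2` is valid for both indexers:

```python
import pandas as pd

# Default index: the row labels are 0, 1, 2, 3.
df = pd.DataFrame({"Marks": [85, 90, 78, 88]})

by_label = df.loc[0:2]      # labels 0, 1 and 2: three rows
by_position = df.iloc[0:2]  # positions 0 and 1: two rows

print(len(by_label))
print(len(by_position))
```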
Practical Example
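import pandas as pd

data = {
    'Student': ['Amit', 'Riya', 'John', 'Sara'],
    'Marks': [85, 90, 78, 88]
}
df = pd.DataFrame(data)

# Using loc: row label 1 (the default index is 0, 1, 2, ...), column label 'Student'
print(df.loc[1, 'Student'])
# Using iloc: second row, first column, both by position
print(df.iloc[1, 0])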
Both will output:
Riya
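The two calls agree here only because the default index makes row label 1 and row position 1 the same row. As a sketch of how they can diverge, the snippet below sorts the same data by marks; the name `ranked` is introduced here for illustration.

```python
import pandas as pd

df = pd.DataFrame(
    {"Student": ["Amit", "Riya", "John", "Sara"], "Marks": [85, 90, 78, 88]}
)

# Sorting reorders the rows but keeps their original labels.
ranked = df.sort_values("Marks", ascending=False)

by_label = ranked.loc[1, "Student"]  # the row whose label is 1
by_position = ranked.iloc[1, 0]      # the second row after sorting

print(by_label)
print(by_position)
```

After sorting, loc still finds Riya through label 1, while iloc returns whoever now sits in the second row.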
Why Are loc and iloc Important?
In data science and machine learning, datasets are often large. Efficient data selection helps in:
- Cleaning data
- Filtering useful information
- Preparing training datasets
- Performing analysis faster
Without proper indexing, handling large datasets becomes difficult and inefficient.
Common Mistakes to Avoid
- Confusing loc and iloc
  - loc → labels
  - iloc → positions
- Using string labels in iloc (not allowed)
- Forgetting slicing rules:
  - loc includes end value
  - iloc excludes end value
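The first two mistakes can be reproduced in a few lines. This is a sketch rather than a statement of exact pandas behaviour: the exception type raised by iloc for a string key is assumed to be TypeError or ValueError, so both are caught, and the flags `iloc_failed` and `loc_failed` are names made up for this example.

```python
import pandas as pd

df = pd.DataFrame(
    {"Name": ["Amit", "Riya", "John"], "Age": [20, 22, 21]},
    index=["a", "b", "c"],
)

# Mistake: passing a string label to iloc, which only accepts positions.
try:
    df.iloc["a"]
    iloc_failed = False
except (TypeError, ValueError) as exc:
    iloc_failed = True
    print("iloc rejected the label:", type(exc).__name__)

# Mistake: passing a position to loc when the row labels are strings.
try:
    df.loc[0]
    loc_failed = False
except KeyError as exc:
    loc_failed = True
    print("loc could not find label:", exc)

# Correct versions of the same selections.
same_row_by_position = df.iloc[0]
same_row_by_label = df.loc["a"]
print(same_row_by_position.equals(same_row_by_label))
```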
Understanding how to index and select data with loc and iloc is essential for anyone learning data analysis with pandas. While NumPy is powerful for numerical computations, pandas provides structured data handling features that make data selection simple and efficient.
Mastering these concepts will help you work confidently with datasets, perform analysis faster, and build a strong foundation for data science and machine learning.
