Continuing our deep dive into Pandas, this post covers the different ways to create Series, DataFrames, and Panels. Knowing these construction methods lets you load data in whatever form it arrives: lists, dictionaries, NumPy arrays, or existing DataFrames. Note that Panel has been removed from current Pandas releases and is covered here only for completeness.
For a foundational understanding of these concepts, you might want to read our previous blogs on Comprehensive Guide to Data Types in Pandas: DataFrame, Series, and Panel and Pandas in Python: Your Ultimate Guide to Data Manipulation.
Creating Series in Pandas
A Series is a one-dimensional labeled array capable of holding any data type (integer, string, float, Python objects, etc.). Here are several ways to create one:
Creating a Series from a List
import pandas as pd
# Creating a Series from a list
data = [1, 2, 3, 4, 5]
series = pd.Series(data)
print(series)
Creating a Series with a Custom Index
# Creating a Series with a custom index
series = pd.Series(data, index=['a', 'b', 'c', 'd', 'e'])
print(series)
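Once a Series has a custom index, its values can be looked up either by label or by position. The snippet below is a small sketch of that difference; the variable names (`by_label`, `by_position`, `subset`) are only illustrative.

```python
import pandas as pd

series = pd.Series([1, 2, 3, 4, 5], index=['a', 'b', 'c', 'd', 'e'])

# Label-based lookup uses the custom index
by_label = series['c']

# Position-based lookup ignores the labels and counts from 0
by_position = series.iloc[2]

# Label slices with .loc include both endpoints ('b' through 'd')
subset = series.loc['b':'d']

print(by_label, by_position)
print(subset)
```

Both lookups return the same element here, because 'c' is the third label.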
Creating a Series from a Dictionary
# Creating a Series from a dictionary
data = {'a': 1, 'b': 2, 'c': 3}
series = pd.Series(data)
print(series)
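When a dictionary is passed together with an explicit index, Pandas picks values by matching the index labels against the dictionary keys. The sketch below, which uses a made-up label 'z' that is not in the dictionary, shows that unmatched labels become NaN and unlisted keys are dropped.

```python
import pandas as pd

data = {'a': 1, 'b': 2, 'c': 3}

# The index decides which keys are used and in what order.
# 'z' is not a key in data, so its value is NaN;
# 'b' is not in the index, so it is left out.
series = pd.Series(data, index=['c', 'a', 'z'])
print(series)
```

Because NaN is a floating-point value, the resulting Series is stored as float64 rather than int64.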
Creating a Series from a NumPy Array
import numpy as np
# Creating a Series from a NumPy array
data = np.array([1, 2, 3, 4, 5])
series = pd.Series(data)
print(series)
Creating a Series from a Scalar Value
# Creating a Series from a scalar value
series = pd.Series(5, index=['a', 'b', 'c', 'd', 'e'])
print(series)
Creating DataFrames in Pandas
A DataFrame is a two-dimensional labeled data structure with columns of potentially different types. Here’s how you can create a DataFrame:
Creating a DataFrame from a Dictionary
# Creating a DataFrame from a dictionary
data = {
'Name': ['John', 'Anna', 'Peter', 'Linda'],
'Age': [28, 24, 35, 32],
'City': ['New York', 'Paris', 'Berlin', 'London']
}
df = pd.DataFrame(data)
print(df)
Creating a DataFrame from a List of Dictionaries
# Creating a DataFrame from a list of dictionaries
data = [
{'Name': 'John', 'Age': 28, 'City': 'New York'},
{'Name': 'Anna', 'Age': 24, 'City': 'Paris'},
{'Name': 'Peter', 'Age': 35, 'City': 'Berlin'},
{'Name': 'Linda', 'Age': 32, 'City': 'London'}
]
df = pd.DataFrame(data)
print(df)
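Real-world records often do not all share the same keys. As a brief illustration (the `records` list is invented for this example), a list of dictionaries with missing keys still builds a DataFrame, with the gaps filled by NaN:

```python
import pandas as pd

records = [
    {'Name': 'John', 'Age': 28, 'City': 'New York'},
    {'Name': 'Anna', 'Age': 24},          # no 'City' key
    {'Name': 'Peter', 'City': 'Berlin'},  # no 'Age' key
]

# Columns are the union of all keys; missing entries become NaN
df = pd.DataFrame(records)
print(df)
print(df.isna().sum())  # number of missing values per column
```

Note that the Age column becomes float64 here, since an integer column cannot hold NaN by default.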
Creating a DataFrame from a List of Lists
# Creating a DataFrame from a list of lists
data = [
['John', 28, 'New York'],
['Anna', 24, 'Paris'],
['Peter', 35, 'Berlin'],
['Linda', 32, 'London']
]
df = pd.DataFrame(data, columns=['Name', 'Age', 'City'])
print(df)
Creating a DataFrame from a NumPy Array
import numpy as np
# Creating a DataFrame from a NumPy array
# Note: a NumPy array has one dtype, so these mixed values are all stored as strings
data = np.array([
['John', 28, 'New York'],
['Anna', 24, 'Paris'],
['Peter', 35, 'Berlin'],
['Linda', 32, 'London']
])
df = pd.DataFrame(data, columns=['Name', 'Age', 'City'])
print(df)
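A NumPy array holds a single dtype, so mixing names and numbers in one array turns every value, including the ages, into text. The following sketch shows one way to restore a numeric column afterwards with `pd.to_numeric`; the shortened two-row array is just for illustration.

```python
import numpy as np
import pandas as pd

data = np.array([
    ['John', 28, 'New York'],
    ['Anna', 24, 'Paris'],
])
df = pd.DataFrame(data, columns=['Name', 'Age', 'City'])

# Every cell came from a string array, so Age holds text such as '28'
age_is_text = isinstance(df.loc[0, 'Age'], str)
print(age_is_text)

# Convert the column back to numbers before doing arithmetic on it
df['Age'] = pd.to_numeric(df['Age'])
print(df.dtypes)
```

For mixed-type data, a dictionary or a list of lists is usually the simpler starting point, because each column keeps its own type.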
Creating a DataFrame from Another DataFrame
# Creating a DataFrame from another DataFrame
data = {
'Name': ['John', 'Anna', 'Peter', 'Linda'],
'Age': [28, 24, 35, 32],
'City': ['New York', 'Paris', 'Berlin', 'London']
}
df1 = pd.DataFrame(data)
# Selecting specific columns to create a new DataFrame
df2 = df1[['Name', 'Age']]
print(df2)
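If the new DataFrame will be modified later, it can help to make the separation from the original explicit with `.copy()`. This is a minimal sketch, using a shortened version of the data above, showing that a change to the copy leaves the source untouched:

```python
import pandas as pd

df1 = pd.DataFrame({
    'Name': ['John', 'Anna'],
    'Age': [28, 24],
    'City': ['New York', 'Paris'],
})

# .copy() makes df2 an independent DataFrame
df2 = df1[['Name', 'Age']].copy()

# Changing df2 does not affect df1
df2.loc[0, 'Age'] = 99
print(df1)
print(df2)
```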
Creating Panels in Pandas
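A Panel is a three-dimensional data structure. It was deprecated in Pandas 0.20.0 and removed entirely in Pandas 0.25.0, so the code in this section only runs on older releases. Users are encouraged to use MultiIndex DataFrames (or the xarray package) instead. However, for completeness, here’s how Panels were created: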
Creating a Panel from a Dictionary of DataFrames
# Creating a Panel from a dictionary of DataFrames
# (requires pandas < 0.25; pd.Panel does not exist in later versions)
data = {
'Item1': pd.DataFrame(np.random.randn(4, 3)),
'Item2': pd.DataFrame(np.random.randn(4, 3))
}
panel = pd.Panel(data)
print(panel)
Accessing Data in a Panel
# Accessing data by item
print(panel['Item1'])
# Accessing data by major and minor axis
print(panel.major_xs(1))
print(panel.minor_xs(1))
Operations on Panels
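# Descriptive statistics (describe() was not implemented for Panel objects,
# so describe one item, which is a DataFrame, at a time)
print(panel['Item1'].describe())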
# Transposing the Panel
print(panel.transpose(2, 0, 1))
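On current Pandas versions, the same dictionary of DataFrames can be combined into a single MultiIndex DataFrame with `pd.concat`. The sketch below is one possible equivalent, not an official migration recipe; the level names 'item' and 'row' and the variable names are assumptions made for this example.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
frames = {
    'Item1': pd.DataFrame(rng.standard_normal((4, 3))),
    'Item2': pd.DataFrame(rng.standard_normal((4, 3))),
}

# The dictionary keys become the outer level of a two-level row index
stacked = pd.concat(frames, names=['item', 'row'])
print(stacked)

# Roughly panel['Item1']: all rows for one item
item1 = stacked.loc['Item1']

# Roughly panel.major_xs(1): row 1 from every item
# (items end up as rows here rather than as columns)
row1 = stacked.xs(1, level='row')

# Roughly panel.minor_xs(1): column 1 from every item, one column per item
col1 = stacked[1].unstack('item')
```

Because the result is an ordinary DataFrame, methods such as `describe()` and `groupby(level='item')` work on it directly.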
Conclusion
In this post, we explored the various ways to create Series and DataFrames in Pandas, along with the legacy Panel structure. Each method handles a different kind of data source, which is what makes Pandas a versatile tool for data analysis.
For more detailed insights and foundational concepts, refer to our previous blogs on Comprehensive Guide to Data Types in Pandas: DataFrame, Series, and Panel and Pandas in Python: Your Ultimate Guide to Data Manipulation.
Keep experimenting with these data structures to enhance your data manipulation skills. Happy coding!