
Python Pandas

Pandas is one of the most widely used Python libraries for data analysis. Its two main data structures are Series and DataFrame. A Series is one-dimensional: it is a single labelled column of values. A DataFrame is two-dimensional: it is a table made of rows and columns.
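As a quick illustration of the one-dimensional versus two-dimensional distinction, here is a minimal sketch. The values and the column names "score" and "grade" are made up for this example.

```python
import pandas as pd

# A Series holds one labelled column of values.
s = pd.Series([10, 20, 30], name="score")

# A DataFrame holds several columns that share one row index.
frame = pd.DataFrame({"score": [10, 20, 30], "grade": ["C", "B", "A"]})

print(s.ndim, s.shape)          # 1 (3,)
print(frame.ndim, frame.shape)  # 2 (3, 2)

# Selecting a single column of a DataFrame gives back a Series.
column = frame["score"]
print(type(column).__name__)    # Series
```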

To install Pandas, run "pip install pandas" in a terminal.

Series Data Structure

Series is a one-dimensional array-like object that can hold data of any type. It is similar to a column in a table.

In [2]:
import pandas as pd
import numpy as np

pd.__version__
Out[2]:
'2.2.3'
In [3]:
obj=pd.Series([1,"John",3.5,"Hey"])
obj
Out[3]:
0       1
1    John
2     3.5
3     Hey
dtype: object
In [4]:
obj.values
Out[4]:
array([1, 'John', 3.5, 'Hey'], dtype=object)
In [5]:
obj2=pd.Series([1,"John",3.5,"Hey"],index=["a","b","c","d"])
obj2
Out[5]:
a       1
b    John
c     3.5
d     Hey
dtype: object
In [6]:
obj2["b"] 
Out[6]:
'John'
In [7]:
obj2.index
Out[7]:
Index(['a', 'b', 'c', 'd'], dtype='object')
In [8]:
score={"Jane":90, "Bill":80,"Elon":85,"Tom":75,"Tim":95}
names=pd.Series(score) # Convert to Series 
names
Out[8]:
Jane    90
Bill    80
Elon    85
Tom     75
Tim     95
dtype: int64
In [9]:
names["Tim"] 
Out[9]:
95
In [10]:
names[names>=85] 
Out[10]:
Jane    90
Elon    85
Tim     95
dtype: int64
In [11]:
names["Tom"]=60
names
Out[11]:
Jane    90
Bill    80
Elon    85
Tom     60
Tim     95
dtype: int64
In [12]:
names[names<=80]=83
names
Out[12]:
Jane    90
Bill    83
Elon    85
Tom     83
Tim     95
dtype: int64
In [13]:
"Tom" in names
Out[13]:
True
In [14]:
names/10 
Out[14]:
Jane    9.0
Bill    8.3
Elon    8.5
Tom     8.3
Tim     9.5
dtype: float64
In [15]:
names**2
Out[15]:
Jane    8100
Bill    6889
Elon    7225
Tom     6889
Tim     9025
dtype: int64
In [16]:
names.isnull() 
Out[16]:
Jane    False
Bill    False
Elon    False
Tom     False
Tim     False
dtype: bool
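In the output above every value is False because the Series has no missing data. The sketch below shows one way a missing value can appear: building a Series from a dict while asking for a label that is not a key. The label "Sam" is hypothetical and only serves to create a gap.

```python
import pandas as pd

score = {"Jane": 90, "Bill": 80, "Elon": 85}

# "Sam" is not a key of the dict, so pandas stores NaN for that label.
wanted = ["Jane", "Bill", "Sam"]
partial = pd.Series(score, index=wanted)

print(partial)
print(partial.isnull())

# Counting the True values gives the number of missing entries.
missing_count = int(partial.isnull().sum())
print(missing_count)  # 1
```

Note that the dtype becomes float64 here, because NaN is a floating-point value.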

Working with Series Data Structure

In [18]:
games=pd.read_csv("https://raw.githubusercontent.com/TirendazAcademy/PANDAS-TUTORIAL/refs/heads/main/DataSets/vgsalesGlobale.csv")
In [19]:
games.head()
Out[19]:
   Rank                      Name  Platform    Year         Genre  Publisher  NA_Sales  EU_Sales  JP_Sales  Other_Sales  Global_Sales
0     1                Wii Sports       Wii  2006.0        Sports   Nintendo     41.49     29.02      3.77         8.46         82.74
1     2         Super Mario Bros.       NES  1985.0      Platform   Nintendo     29.08      3.58      6.81         0.77         40.24
2     3            Mario Kart Wii       Wii  2008.0        Racing   Nintendo     15.85     12.88      3.79         3.31         35.82
3     4         Wii Sports Resort       Wii  2009.0        Sports   Nintendo     15.75     11.01      3.28         2.96         33.00
4     5  Pokemon Red/Pokemon Blue        GB  1996.0  Role-Playing   Nintendo     11.27      8.89     10.22         1.00         31.37
In [20]:
games.dtypes
Out[20]:
Rank              int64
Name             object
Platform         object
Year            float64
Genre            object
Publisher        object
NA_Sales        float64
EU_Sales        float64
JP_Sales        float64
Other_Sales     float64
Global_Sales    float64
dtype: object
In [21]:
games.Genre.describe()
Out[21]:
count      16598
unique        12
top       Action
freq        3316
Name: Genre, dtype: object
In [22]:
games.Genre.value_counts() 
Out[22]:
Genre
Action          3316
Sports          2346
Misc            1739
Role-Playing    1488
Shooter         1310
Adventure       1286
Racing          1249
Platform         886
Simulation       867
Fighting         848
Strategy         681
Puzzle           582
Name: count, dtype: int64
In [23]:
games.Genre.value_counts(normalize=True) 
Out[23]:
Genre
Action          0.199783
Sports          0.141342
Misc            0.104772
Role-Playing    0.089649
Shooter         0.078925
Adventure       0.077479
Racing          0.075250
Platform        0.053380
Simulation      0.052235
Fighting        0.051090
Strategy        0.041029
Puzzle          0.035064
Name: proportion, dtype: float64
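With normalize=True each count is divided by the total number of non-missing values, for example 3316 / 16598 ≈ 0.199783 for Action. The sketch below checks that relationship on a small made-up Series, so it runs without downloading the dataset.

```python
import pandas as pd

# Hypothetical genre labels, used only for this illustration.
genres = pd.Series(["Action", "Sports", "Action", "Puzzle", "Action", "Sports"])

counts = genres.value_counts()                # absolute frequencies
shares = genres.value_counts(normalize=True)  # relative frequencies

# Dividing the counts by their sum reproduces the normalized result.
manual = counts / counts.sum()

print(counts["Action"])  # 3
print(shares["Action"])  # 0.5
```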
In [24]:
type(games.Genre.value_counts())
Out[24]:
pandas.core.series.Series
In [26]:
games.Genre.unique()
Out[26]:
array(['Sports', 'Platform', 'Racing', 'Role-Playing', 'Puzzle', 'Misc',
       'Shooter', 'Simulation', 'Action', 'Fighting', 'Adventure',
       'Strategy'], dtype=object)
In [27]:
games.Genre.nunique()
Out[27]:
12
In [28]:
pd.crosstab(games.Genre, games.Year)
Out[28]:
Year 1980.0 1981.0 1982.0 1983.0 1984.0 1985.0 1986.0 1987.0 1988.0 1989.0 ... 2009.0 2010.0 2011.0 2012.0 2013.0 2014.0 2015.0 2016.0 2017.0 2020.0
Genre
Action 1 25 18 7 1 2 6 2 2 2 ... 272 226 239 266 148 186 255 119 1 0
Adventure 0 0 0 1 0 0 0 1 0 0 ... 141 154 108 58 60 75 54 34 0 0
Fighting 1 0 0 0 0 1 0 2 0 0 ... 53 40 50 29 20 23 21 14 0 0
Misc 4 0 1 1 1 0 0 0 0 1 ... 207 201 184 38 42 41 39 18 0 0
Platform 0 3 5 5 1 4 6 2 4 3 ... 29 31 37 12 37 10 14 10 0 0
Puzzle 0 2 3 1 3 4 0 0 1 5 ... 79 45 43 11 3 8 6 0 0 0
Racing 0 1 2 0 3 0 1 0 1 0 ... 84 57 65 30 16 27 19 20 0 0
Role-Playing 0 0 0 0 0 0 1 3 3 2 ... 103 103 95 78 71 91 78 40 2 0
Shooter 2 10 5 1 3 1 4 2 1 1 ... 91 81 94 48 59 47 34 32 0 0
Simulation 0 1 0 0 0 1 0 0 1 0 ... 123 82 56 18 18 11 15 9 0 1
Sports 1 4 2 1 2 1 3 4 2 3 ... 184 186 122 54 53 55 62 38 0 0
Strategy 0 0 0 0 0 0 0 0 0 0 ... 65 53 46 15 19 8 17 10 0 0

12 rows × 39 columns
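pd.crosstab counts how often each combination of two columns occurs. The following sketch uses a tiny made-up table so the counts can be checked by eye; margins=True is an optional argument that adds row and column totals labelled "All".

```python
import pandas as pd

# Hypothetical data: six games with a genre and a release year.
toy = pd.DataFrame({
    "Genre": ["Action", "Action", "Sports", "Puzzle", "Sports", "Action"],
    "Year": [2006, 2007, 2006, 2007, 2007, 2006],
})

# Rows are genres, columns are years, cells are counts.
table = pd.crosstab(toy["Genre"], toy["Year"])
print(table)

# The same table with totals added.
with_totals = pd.crosstab(toy["Genre"], toy["Year"], margins=True)
print(with_totals)
```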

In [29]:
games.Global_Sales.describe()
Out[29]:
count    16598.000000
mean         0.537441
std          1.555028
min          0.010000
25%          0.060000
50%          0.170000
75%          0.470000
max         82.740000
Name: Global_Sales, dtype: float64
In [31]:
print(games.Global_Sales.mean())

print(games.Global_Sales.median())

print(games.Global_Sales.std())

print(games.Global_Sales.max())
0.5374406555006628
0.17
1.5550279355699124
82.74
In [32]:
games.Global_Sales.value_counts()
Out[32]:
Global_Sales
0.02    1071
0.03     811
0.04     645
0.05     632
0.01     618
        ... 
5.01       1
5.05       1
5.07       1
5.11       1
3.16       1
Name: count, Length: 623, dtype: int64
In [35]:
games.Year.plot(kind="hist")
Out[35]:
<Axes: ylabel='Frequency'>
In [36]:
games.Year.plot(kind="box")
Out[36]:
<Axes: >
In [37]:
games.Year.plot(kind="kde")
Out[37]:
<Axes: ylabel='Density'>
In [39]:
games.Genre.value_counts().plot(kind="bar")
Out[39]:
<Axes: xlabel='Genre'>

DataFrame Data Structure

DataFrame is a two-dimensional data structure that can hold data of different types. It is similar to a table with rows and columns.

In [40]:
data={"name":["Bill","Tom","Tim","John","Alex","Vanessa","Kate"],      
      "score":[90,80,85,75,95,60,65],      
      "sport":["Wrestling","Football","Skiing","Swimming","Tennis",
               "Karete","Surfing"],      
      "sex":["M","M","M","M","F","F","F"]}

df=pd.DataFrame(data)
df
Out[40]:
name score sport sex
0 Bill 90 Wrestling M
1 Tom 80 Football M
2 Tim 85 Skiing M
3 John 75 Swimming M
4 Alex 95 Tennis F
5 Vanessa 60 Karete F
6 Kate 65 Surfing F
In [41]:
df=pd.DataFrame(data,columns=["name","sport","sex","score"])
df
Out[41]:
name sport sex score
0 Bill Wrestling M 90
1 Tom Football M 80
2 Tim Skiing M 85
3 John Swimming M 75
4 Alex Tennis F 95
5 Vanessa Karete F 60
6 Kate Surfing F 65
In [42]:
df=pd.DataFrame(data,columns=["name", "sport", "gender", "score", "age"],
                index=["one","two","three","four","five","six","seven"])
df
Out[42]:
name sport gender score age
one Bill Wrestling NaN 90 NaN
two Tom Football NaN 80 NaN
three Tim Skiing NaN 85 NaN
four John Swimming NaN 75 NaN
five Alex Tennis NaN 95 NaN
six Vanessa Karete NaN 60 NaN
seven Kate Surfing NaN 65 NaN
In [43]:
df["sport"]
Out[43]:
one      Wrestling
two       Football
three       Skiing
four      Swimming
five        Tennis
six         Karete
seven      Surfing
Name: sport, dtype: object
In [44]:
my_columns=["name","sport"]
df[my_columns]
Out[44]:
name sport
one Bill Wrestling
two Tom Football
three Tim Skiing
four John Swimming
five Alex Tennis
six Vanessa Karete
seven Kate Surfing
In [45]:
df.sport
Out[45]:
one      Wrestling
two       Football
three       Skiing
four      Swimming
five        Tennis
six         Karete
seven      Surfing
Name: sport, dtype: object
In [46]:
df.loc[["one"]]
Out[46]:
name sport gender score age
one Bill Wrestling NaN 90 NaN
In [47]:
df.loc[["one","two"]]
Out[47]:
name sport gender score age
one Bill Wrestling NaN 90 NaN
two Tom Football NaN 80 NaN
In [48]:
df["age"]=18
In [49]:
df=pd.DataFrame(data,columns=["name", "sport", "gender", "score", "age"], 
                index=["one","two","three","four","five","six","seven"])
values=[18,19,20,18,17,17,18]
df["age"]=values
df
Out[49]:
name sport gender score age
one Bill Wrestling NaN 90 18
two Tom Football NaN 80 19
three Tim Skiing NaN 85 20
four John Swimming NaN 75 18
five Alex Tennis NaN 95 17
six Vanessa Karete NaN 60 17
seven Kate Surfing NaN 65 18
In [50]:
df["pass"]=df.score>=70
df
Out[50]:
name sport gender score age pass
one Bill Wrestling NaN 90 18 True
two Tom Football NaN 80 19 True
three Tim Skiing NaN 85 20 True
four John Swimming NaN 75 18 True
five Alex Tennis NaN 95 17 True
six Vanessa Karete NaN 60 17 False
seven Kate Surfing NaN 65 18 False
In [51]:
del df["pass"]
df
Out[51]:
name sport gender score age
one Bill Wrestling NaN 90 18
two Tom Football NaN 80 19
three Tim Skiing NaN 85 20
four John Swimming NaN 75 18
five Alex Tennis NaN 95 17
six Vanessa Karete NaN 60 17
seven Kate Surfing NaN 65 18
In [52]:
scores={"Math":{"A":85,"B":90,"C":95}, "Physics":{"A":90,"B":80,"C":75}}
In [53]:
scores_df=pd.DataFrame(scores)
scores_df
Out[53]:
Math Physics
A 85 90
B 90 80
C 95 75
In [54]:
scores_df.T
Out[54]:
A B C
Math 85 90 95
Physics 90 80 75
In [55]:
scores_df.index.name="name"
scores_df.columns.name="lesson"
scores_df
Out[55]:
lesson  Math  Physics
name
A         85       90
B         90       80
C         95       75
In [56]:
scores_df.values
Out[56]:
array([[85, 90],
       [90, 80],
       [95, 75]])
In [57]:
scores_index=scores_df.index
In [58]:
scores_index[1]="Jack"
scores_index
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
Cell In[58], line 1
----> 1 scores_index[1]="Jack"
      2 scores_index

File ~/work/AI/blog/.venv/lib/python3.12/site-packages/pandas/core/indexes/base.py:5371, in Index.__setitem__(self, key, value)
   5369 @final
   5370 def __setitem__(self, key, value) -> None:
-> 5371     raise TypeError("Index does not support mutable operations")

TypeError: Index does not support mutable operations
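An Index cannot be changed element by element, which is what the TypeError above reports. One common way to get a different label is to build a new object with rename, sketched here on the same scores data. The replacement label "Jack" comes from the cell above.

```python
import pandas as pd

scores_df = pd.DataFrame({"Math": {"A": 85, "B": 90, "C": 95},
                          "Physics": {"A": 90, "B": 80, "C": 75}})

# Assigning into the Index raises TypeError because Index objects are immutable.
try:
    scores_df.index[1] = "Jack"
    mutated = True
except TypeError as err:
    mutated = False
    print("TypeError:", err)

# rename returns a new DataFrame with the label replaced.
renamed = scores_df.rename(index={"B": "Jack"})
print(renamed.index.tolist())    # ['A', 'Jack', 'C']

# The original DataFrame keeps its labels.
print(scores_df.index.tolist())  # ['A', 'B', 'C']
```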

Indexing & Selection & Filtering

In [59]:
import numpy as np
In [61]:
obj=pd.Series(np.arange(5),
              index=["a","b","c","d","e"])
obj
Out[61]:
a    0
b    1
c    2
d    3
e    4
dtype: int64
In [62]:
obj["c"]
Out[62]:
2
In [63]:
obj.iloc[2]
Out[63]:
2
In [64]:
obj[0:3]
Out[64]:
a    0
b    1
c    2
dtype: int64
In [65]:
obj[["a","c"]]
Out[65]:
a    0
c    2
dtype: int64
In [66]:
obj.iloc[[0,2]]
Out[66]:
a    0
c    2
dtype: int64
In [67]:
obj[obj<2]
Out[67]:
a    0
b    1
dtype: int64
In [68]:
obj["a":"c"]
Out[68]:
a    0
b    1
c    2
dtype: int64
In [69]:
obj["b":"c"]=5
obj
Out[69]:
a    0
b    5
c    5
d    3
e    4
dtype: int64
In [70]:
data=pd.DataFrame(
    np.arange(16).reshape(4,4),
    index=["London","Paris",
           "Berlin","Istanbul"],
    columns=["one","two","three","four"])
data
Out[70]:
one two three four
London 0 1 2 3
Paris 4 5 6 7
Berlin 8 9 10 11
Istanbul 12 13 14 15
In [71]:
data["two"]
Out[71]:
London       1
Paris        5
Berlin       9
Istanbul    13
Name: two, dtype: int64
In [72]:
data[["one","two"]]
Out[72]:
one two
London 0 1
Paris 4 5
Berlin 8 9
Istanbul 12 13
In [73]:
data[:3]
Out[73]:
one two three four
London 0 1 2 3
Paris 4 5 6 7
Berlin 8 9 10 11
In [74]:
data[data["four"]>5]
Out[74]:
one two three four
Paris 4 5 6 7
Berlin 8 9 10 11
Istanbul 12 13 14 15
In [75]:
data[data<5]=0
data
Out[75]:
one two three four
London 0 0 0 0
Paris 0 5 6 7
Berlin 8 9 10 11
Istanbul 12 13 14 15
In [76]:
data.iloc[1]
Out[76]:
one      0
two      5
three    6
four     7
Name: Paris, dtype: int64
In [77]:
data.iloc[1,[1,2,3]]
Out[77]:
two      5
three    6
four     7
Name: Paris, dtype: int64
In [78]:
data.loc["Paris",["one","two"]]
Out[78]:
one    0
two    5
Name: Paris, dtype: int64
In [79]:
data.loc[:"Paris","four"]
Out[79]:
London    0
Paris     7
Name: four, dtype: int64
In [80]:
toy_data=pd.Series(np.arange(5),
                   index=["a","b","c",
                          "d","e"])
toy_data
Out[80]:
a    0
b    1
c    2
d    3
e    4
dtype: int64
In [81]:
toy_data.iloc[-1]
Out[81]:
4
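Plain square brackets on a Series can mean either a label or a position, so the explicit accessors loc (labels) and iloc (positions) make the intent clear. The sketch below, on the same kind of toy Series, also shows that a label slice includes its end point while a position slice does not.

```python
import numpy as np
import pandas as pd

obj = pd.Series(np.arange(5), index=["a", "b", "c", "d", "e"])

by_label = obj.loc["c"]    # select by index label
by_position = obj.iloc[2]  # select by integer position
last = obj.iloc[-1]        # negative positions count from the end

# Label slices include the end label.
label_slice = obj.loc["a":"c"]
# Position slices exclude the end position, like Python lists.
position_slice = obj.iloc[0:2]

print(label_slice.index.tolist())     # ['a', 'b', 'c']
print(position_slice.index.tolist())  # ['a', 'b']
```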

Useful Methods

In [82]:
s=pd.Series([1,2,3,4],
            index=["a","b","c","d"])
s
Out[82]:
a    1
b    2
c    3
d    4
dtype: int64
In [83]:
s2=s.reindex(["b","d","a","c","e"])
s2
Out[83]:
b    2.0
d    4.0
a    1.0
c    3.0
e    NaN
dtype: float64
In [84]:
s3=pd.Series(["blue","yellow","purple"],
             index=[0,2,4])
s3
Out[84]:
0      blue
2    yellow
4    purple
dtype: object
In [85]:
s3.reindex(range(6),method="ffill")
Out[85]:
0      blue
1      blue
2    yellow
3    yellow
4    purple
5    purple
dtype: object
In [86]:
df=pd.DataFrame(np.arange(9).reshape(3,3),
                index=["a","c","d"],
                columns=["Tim","Tom","Kate"])
df
Out[86]:
Tim Tom Kate
a 0 1 2
c 3 4 5
d 6 7 8
In [87]:
df2=df.reindex(["d","c","b","a"])
df2
Out[87]:
Tim Tom Kate
d 6.0 7.0 8.0
c 3.0 4.0 5.0
b NaN NaN NaN
a 0.0 1.0 2.0
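Reindexing with a label that does not exist ("b" above) inserts a row of NaN, which is also why the integers turn into floats. If a default value is preferable, reindex accepts a fill_value argument, as in this small sketch on the same data.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(np.arange(9).reshape(3, 3),
                  index=["a", "c", "d"],
                  columns=["Tim", "Tom", "Kate"])

# Rows for unknown labels are filled with 0 instead of NaN.
filled = df.reindex(["d", "c", "b", "a"], fill_value=0)
print(filled)

# reindex returns a new object; df itself still has three rows.
print(df.shape)  # (3, 3)
```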
In [88]:
names=["Kate","Tim","Tom"]
df.reindex(columns=names)
Out[88]:
Kate Tim Tom
a 2 0 1
c 5 3 4
d 8 6 7
In [89]:
df.loc[["c","d","a"]]
Out[89]:
Tim Tom Kate
c 3 4 5
d 6 7 8
a 0 1 2
In [90]:
s=pd.Series(np.arange(5.),
            index=["a","b","c","d","e"])
s
Out[90]:
a    0.0
b    1.0
c    2.0
d    3.0
e    4.0
dtype: float64
In [91]:
new_s=s.drop("b")
new_s
Out[91]:
a    0.0
c    2.0
d    3.0
e    4.0
dtype: float64
In [92]:
s.drop(["c","d"])
Out[92]:
a    0.0
b    1.0
e    4.0
dtype: float64
In [93]:
data=pd.DataFrame(np.arange(16).reshape(4,4),
                  index=["Kate","Tim",
                         "Tom","Alex"],
                  columns=list("ABCD"))
data
Out[93]:
A B C D
Kate 0 1 2 3
Tim 4 5 6 7
Tom 8 9 10 11
Alex 12 13 14 15
In [94]:
data.drop(["Kate","Tim"])
Out[94]:
A B C D
Tom 8 9 10 11
Alex 12 13 14 15
In [95]:
data.drop("A",axis=1)
Out[95]:
B C D
Kate 1 2 3
Tim 5 6 7
Tom 9 10 11
Alex 13 14 15
In [96]:
data.drop("Kate",axis=0)
Out[96]:
A B C D
Tim 4 5 6 7
Tom 8 9 10 11
Alex 12 13 14 15
In [97]:
data
Out[97]:
A B C D
Kate 0 1 2 3
Tim 4 5 6 7
Tom 8 9 10 11
Alex 12 13 14 15
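As the last cell shows, data is unchanged after the drop calls: drop returns a new object and leaves the original alone. To keep the result, assign it to a name. The sketch below also uses the index= and columns= keywords, an alternative to axis that removes rows and columns in one call.

```python
import numpy as np
import pandas as pd

data = pd.DataFrame(np.arange(16).reshape(4, 4),
                    index=["Kate", "Tim", "Tom", "Alex"],
                    columns=list("ABCD"))

# Remove one row and one column, keeping the result under a new name.
trimmed = data.drop(index=["Kate"], columns=["A"])

print(trimmed.shape)  # (3, 3)
print(data.shape)     # (4, 4), the original is untouched
```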
In [98]:
data.mean(axis="index")
Out[98]:
A    6.0
B    7.0
C    8.0
D    9.0
dtype: float64
In [100]:
data.mean(axis="columns")
Out[100]:
Kate     1.5
Tim      5.5
Tom      9.5
Alex    13.5
dtype: float64
In [101]:
data.mean(axis=None)
Out[101]:
7.5
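The axis argument names the direction that is collapsed: axis="index" (or 0) moves down the rows and gives one value per column, while axis="columns" (or 1) moves across the columns and gives one value per row. A short sketch on the same data confirms that the numeric and named forms agree.

```python
import numpy as np
import pandas as pd

data = pd.DataFrame(np.arange(16).reshape(4, 4),
                    index=["Kate", "Tim", "Tom", "Alex"],
                    columns=list("ABCD"))

col_means = data.mean(axis=0)  # same as axis="index": one mean per column
row_means = data.mean(axis=1)  # same as axis="columns": one mean per row

# Mean of all 16 values, computed on the underlying NumPy array.
overall = data.to_numpy().mean()

print(col_means.tolist())  # [6.0, 7.0, 8.0, 9.0]
print(row_means.tolist())  # [1.5, 5.5, 9.5, 13.5]
print(overall)             # 7.5
```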

Arithmetic Operations

In [102]:
s1=pd.Series(np.arange(4),
             index=["a","c","d","e"])
s2=pd.Series(np.arange(5),
             index=["a","c","e","f","g"])
In [103]:
print(s1)
print(s2)
a    0
c    1
d    2
e    3
dtype: int64
a    0
c    1
e    2
f    3
g    4
dtype: int64
In [104]:
s1+s2
Out[104]:
a    0.0
c    2.0
d    NaN
e    5.0
f    NaN
g    NaN
dtype: float64
In [105]:
df1=pd.DataFrame(
    np.arange(6).reshape(2,3),
    columns=list("ABC"),
    index=["Tim","Tom"])
df2=pd.DataFrame(
    np.arange(9).reshape(3,3),
    columns=list("ACD"),
    index=["Tim","Kate","Tom"])
In [106]:
print(df1)
print(df2)
     A  B  C
Tim  0  1  2
Tom  3  4  5
      A  C  D
Tim   0  1  2
Kate  3  4  5
Tom   6  7  8
In [107]:
df1+df2
Out[107]:
        A    B     C    D
Kate  NaN  NaN   NaN  NaN
Tim   0.0  NaN   3.0  NaN
Tom   9.0  NaN  12.0  NaN
In [108]:
df1.add(df2,fill_value=0)
Out[108]:
        A    B     C    D
Kate  3.0  NaN   4.0  5.0
Tim   0.0  1.0   3.0  2.0
Tom   9.0  4.0  12.0  8.0
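In the result above the cell for Kate in column B is still NaN. fill_value only stands in for a value that is missing in one of the two frames; Kate has no B value in either frame, so there is nothing to add. The sketch below reproduces this and then, as one possible follow-up, replaces the remaining gaps with fillna.

```python
import numpy as np
import pandas as pd

df1 = pd.DataFrame(np.arange(6).reshape(2, 3),
                   columns=list("ABC"), index=["Tim", "Tom"])
df2 = pd.DataFrame(np.arange(9).reshape(3, 3),
                   columns=list("ACD"), index=["Tim", "Kate", "Tom"])

total = df1.add(df2, fill_value=0)

# Present only in df1, so df2 contributes the fill value 0.
print(total.loc["Tim", "B"])             # 1.0
# Missing in both frames, so the result stays NaN.
print(pd.isna(total.loc["Kate", "B"]))   # True

# Optional: treat the remaining gaps as 0 as well.
total_no_gaps = total.fillna(0)
```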
In [109]:
df1
Out[109]:
A B C
Tim 0 1 2
Tom 3 4 5
In [111]:
1/df1
Out[111]:
            A     B    C
Tim       inf  1.00  0.5
Tom  0.333333  0.25  0.2
In [112]:
df1/2
Out[112]:
A B C
Tim 0.0 0.5 1.0
Tom 1.5 2.0 2.5
In [113]:
s=df2.iloc[1]
s
Out[113]:
A    3
C    4
D    5
Name: Kate, dtype: int64
In [114]:
df2
Out[114]:
A C D
Tim 0 1 2
Kate 3 4 5
Tom 6 7 8
In [115]:
df2-s
Out[115]:
A C D
Tim -3 -3 -3
Kate 0 0 0
Tom 3 3 3
In [116]:
s2=df2["A"]
s2
Out[116]:
Tim     0
Kate    3
Tom     6
Name: A, dtype: int64
In [117]:
df2.sub(s2,axis="index")
Out[117]:
A C D
Tim 0 1 2
Kate 0 1 2
Tom 0 1 2
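By default, arithmetic between a DataFrame and a Series matches the Series labels against the DataFrame columns. That is why the column Series needs sub with axis="index". The sketch below shows what the plain minus operator would do with the same column: the labels do not match any column, so every cell becomes NaN.

```python
import numpy as np
import pandas as pd

df2 = pd.DataFrame(np.arange(9).reshape(3, 3),
                   columns=list("ACD"), index=["Tim", "Kate", "Tom"])
col = df2["A"]  # labels are Tim, Kate, Tom

# Labels are matched against the columns A, C, D: no overlap, all NaN.
misaligned = df2 - col
print(misaligned)

# Matching on the row index subtracts the column from every column.
aligned = df2.sub(col, axis="index")
print(aligned)
```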
In [118]:
df2
Out[118]:
A C D
Tim 0 1 2
Kate 3 4 5
Tom 6 7 8

Applying a Function

In [119]:
df=pd.DataFrame(
    np.random.randn(4,3),
    columns=list("ABC"),
    index=["Kim","Susan","Tim","Tom"])
df
Out[119]:
A B C
Kim 2.554629 -1.113764 0.968447
Susan 0.596522 -0.653082 0.068941
Tim -1.996567 -1.629866 1.012815
Tom -0.250421 -1.260170 0.384344
In [120]:
np.abs(df)
Out[120]:
A B C
Kim 2.554629 1.113764 0.968447
Susan 0.596522 0.653082 0.068941
Tim 1.996567 1.629866 1.012815
Tom 0.250421 1.260170 0.384344
In [121]:
f=lambda x:x.max()-x.min()
In [122]:
df.apply(f)
Out[122]:
A    4.551196
B    0.976784
C    0.943874
dtype: float64
In [123]:
df.apply(f,axis=1)
Out[123]:
Kim      3.668393
Susan    1.249605
Tim      3.009383
Tom      1.644514
dtype: float64
In [124]:
def f(x):
    return x**2
In [125]:
df.apply(f)
Out[125]:
A B C
Kim 6.526127 1.240471 0.937890
Susan 0.355839 0.426516 0.004753
Tim 3.986281 2.656463 1.025795
Tom 0.062710 1.588027 0.147720
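The function passed to apply receives a whole column (or a whole row with axis=1). It may return a single number, as in the range example, or a Series, in which case the results are stacked into a DataFrame. The sketch below shows the second case and also uses Series.map for a value-by-value transformation. The helper name min_max and the seeded random generator are choices made for this example.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)  # seeded so the example is repeatable
df = pd.DataFrame(rng.standard_normal((4, 3)),
                  columns=list("ABC"),
                  index=["Kim", "Susan", "Tim", "Tom"])

def min_max(col):
    # Called once per column; returns two labelled values.
    return pd.Series([col.min(), col.max()], index=["min", "max"])

summary = df.apply(min_max)  # 2 rows (min, max) by 3 columns
print(summary)

# map works on one value at a time, here formatting each number as text.
labels = df["A"].map(lambda v: f"{v:.2f}")
print(labels)
```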

Sorting & Ranking

In [3]:
s=pd.Series(range(5),
            index=["e","d","a","b","c"])
s
Out[3]:
e    0
d    1
a    2
b    3
c    4
dtype: int64
In [4]:
s.sort_index()
Out[4]:
a    2
b    3
c    4
d    1
e    0
dtype: int64
In [8]:
df=pd.DataFrame(
    np.arange(12).reshape(3,4),
    index=["two","one","three"],
    columns=["d","a","b","c"])
df
Out[8]:
d a b c
two 0 1 2 3
one 4 5 6 7
three 8 9 10 11
In [11]:
df.sort_index()
Out[11]:
d a b c
one 4 5 6 7
three 8 9 10 11
two 0 1 2 3
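The cells above sort by row labels. As a sketch of the remaining sorting and ranking operations, the example below sorts the same DataFrame by column labels and in descending order, then sorts and ranks a small made-up Series by value. With the default ranking method, tied values share the average of their ranks.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame(np.arange(12).reshape(3, 4),
                  index=["two", "one", "three"],
                  columns=["d", "a", "b", "c"])

by_columns = df.sort_index(axis=1)           # columns in order a, b, c, d
descending = df.sort_index(ascending=False)  # rows in order two, three, one

values = pd.Series([7, -5, 7, 4])            # hypothetical values with a tie

ordered = values.sort_values()               # smallest to largest
ranks = values.rank()                        # ties get the average rank
first_ranks = values.rank(method="first")    # ties broken by order of appearance

print(ranks.tolist())        # [3.5, 1.0, 3.5, 2.0]
print(first_ranks.tolist())  # [3.0, 1.0, 4.0, 2.0]
```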
