Python Pandas Function and Syntax 1 (Filtering and Sorting)

 

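A DataFrame stores its three main components as attributes:

  • .values: A two-dimensional NumPy array of values.
  • .columns: An index of columns: the column names.
  • .index: An index for the rows: either row numbers or row names.

# Print the values of homelessness
print(homelessness.values)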

# Print the column index of homelessness
print(homelessness.columns)

# Print the row index of homelessness
print(homelessness.index)
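
To try these attributes without the homelessness dataset, here is a minimal sketch. The small dogs DataFrame is toy data invented for illustration; it is not part of the original exercise.

```python
import pandas as pd

# Hypothetical toy data, invented for illustration only
dogs = pd.DataFrame(
    {
        "name": ["Bella", "Charlie", "Lucy"],
        "breed": ["Labrador", "Poodle", "Chow Chow"],
        "height_cm": [56, 43, 46],
    }
)

values = dogs.values    # 2-D NumPy array holding the cell values
columns = dogs.columns  # Index object containing the column names
index = dogs.index      # default row index: the row numbers 0, 1, 2

print(values.shape)     # (3, 3)
print(list(columns))    # ['name', 'breed', 'height_cm']
print(list(index))      # [0, 1, 2]
```

Because no row labels were supplied, the row index falls back to row numbers.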

Sorting in Pandas

One column: df.sort_values("breed")
Multiple columns: df.sort_values(["breed", "weight_kg"])

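To sort each column in a different direction, pass a list to ascending with one entry per column. Here region is sorted ascending and family_members descending:

homelessness_reg_fam = homelessness.sort_values(["region", "family_members"], ascending=[True, False])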
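
The effect of a per-column ascending list is easier to see on a few rows. This is a small sketch on made-up data; the dogs DataFrame and its values are assumptions for illustration.

```python
import pandas as pd

# Hypothetical toy data, invented for illustration only
dogs = pd.DataFrame(
    {
        "name": ["Bella", "Charlie", "Lucy", "Cooper"],
        "breed": ["Labrador", "Poodle", "Labrador", "Poodle"],
        "weight_kg": [25, 23, 29, 17],
    }
)

# breed ascending (A to Z), then weight_kg descending within each breed
sorted_dogs = dogs.sort_values(["breed", "weight_kg"], ascending=[True, False])

print(sorted_dogs["name"].tolist())  # ['Lucy', 'Bella', 'Charlie', 'Cooper']
```

Note that sort_values returns a new DataFrame and leaves dogs unchanged, which is why the result is assigned to a new variable.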


Subsetting Rows

dogs[dogs["height_cm"] > 60]
dogs[dogs["color"] == "tan"]

You can filter for multiple conditions at once by using the "logical and" operator, &.

dogs[(dogs["height_cm"] > 60) & (dogs["color"] == "tan")]

In the example below, homelessness is an already-loaded DataFrame and pandas is imported as pd.

fam_lt_1k_pac = homelessness[(homelessness['family_members'] < 1000) & (homelessness['region'] == 'Pacific')]
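
Each condition inside the brackets is a boolean Series, and & combines them row by row. The sketch below, on made-up data, names the two masks separately to make that visible. The parentheses in the one-line form matter because & binds more tightly than > and ==.

```python
import pandas as pd

# Hypothetical toy data, invented for illustration only
dogs = pd.DataFrame(
    {
        "name": ["Bella", "Charlie", "Lucy", "Cooper"],
        "color": ["tan", "black", "tan", "brown"],
        "height_cm": [62, 43, 55, 70],
    }
)

is_tall = dogs["height_cm"] > 60  # boolean Series: True where taller than 60 cm
is_tan = dogs["color"] == "tan"   # boolean Series: True where the color is tan

# Keep only the rows where both masks are True
tall_and_tan = dogs[is_tall & is_tan]

print(tall_and_tan["name"].tolist())  # ['Bella']
```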


Subsetting Rows by Categorical Variables


Subsetting data based on a categorical variable often involves using the "or" operator (|) to select rows from multiple categories. This can get tedious when you want all states in one of three different regions, for example. Instead, use the .isin() method, which will allow you to tackle this problem by writing one condition instead of three separate ones.

colors = ["brown", "black", "tan"]
condition = dogs["color"].isin(colors)
dogs[condition]

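For example, if canu is a list of state names (the name suggests California, Arizona, Nevada, and Utah, the Mojave Desert states):

canu = ["California", "Arizona", "Nevada", "Utah"]
mojave_homelessness = homelessness[homelessness['state'].isin(canu)]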
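
To check that .isin() really replaces a chain of "or" conditions, here is a short sketch on made-up data that builds the same subset both ways. The dogs DataFrame is an assumption for illustration.

```python
import pandas as pd

# Hypothetical toy data, invented for illustration only
dogs = pd.DataFrame(
    {
        "name": ["Bella", "Charlie", "Lucy", "Cooper"],
        "color": ["tan", "black", "tan", "brown"],
    }
)

colors = ["brown", "black"]

# One comparison per category, chained with the "or" operator
with_or = dogs[(dogs["color"] == "brown") | (dogs["color"] == "black")]

# The same subset written as a single condition
with_isin = dogs[dogs["color"].isin(colors)]

# ~ negates the mask: rows whose color is NOT in the list
not_in = dogs[~dogs["color"].isin(colors)]

print(with_or.equals(with_isin))      # True
print(with_isin["name"].tolist())     # ['Charlie', 'Cooper']
print(not_in["name"].tolist())        # ['Bella', 'Lucy']
```

The .isin() version stays one line long no matter how many categories are in the list.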


