{
"cells": [
{
"cell_type": "markdown",
"id": "40af5846",
"metadata": {},
"source": [
"# The Pandas library\n",
"\n",
"Below is an overview of the most basic methods that the Pandas library offers. Each of them comes with many additional options, which are described in detail in the [official documentation](http://pandas.pydata.org/pandas-docs/stable/). The best place to start reading the documentation is the [tutorials](http://pandas.pydata.org/pandas-docs/stable/tutorials.html).\n",
"\n",
"## Setup"
]
},
{
"cell_type": "code",
"execution_count": 1,
"id": "f5d8a416",
"metadata": {},
"outputs": [],
"source": [
"# load the package\n",
"import pandas as pd\n",
"\n",
"# load the table we will be working with\n",
"filmi = pd.read_csv('podatki/filmi.csv', index_col='id')\n",
"\n",
"# we will be working with large tables, so tables longer than 20 rows are displayed in truncated form\n",
"pd.options.display.max_rows = 20"
]
},
{
"cell_type": "markdown",
"id": "b1719851",
"metadata": {},
"source": [
"## Basic selection of table elements\n",
"\n",
"The method `.head(n=5)` shows the first `n` rows of a table, and the method `.tail(n=5)` shows the last `n` rows. In both cases `n` defaults to 5."
]
},
{
"cell_type": "code",
"execution_count": 2,
"id": "dbc7384e",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
" naslov dolzina leto ocena metascore glasovi \\\n",
"id \n",
"4972 The Birth of a Nation 195 1915 6.2 NaN 24890 \n",
"6864 Intolerance 197 1916 7.7 99.0 15670 \n",
"9968 Broken Blossoms 90 1919 7.3 NaN 10423 \n",
"10323 The Cabinet of Dr. Caligari 67 1920 8.0 NaN 64133 \n",
"12349 The Kid 68 1921 8.3 NaN 126513 \n",
"12364 The Phantom Carriage 107 1921 8.0 NaN 12624 \n",
"13442 Nosferatu 94 1922 7.9 NaN 97589 \n",
"14341 Our Hospitality 65 1923 7.8 NaN 11428 \n",
"14429 Safety Last! 74 1923 8.1 NaN 20887 \n",
"15064 The Last Laugh 90 1924 8.0 NaN 14150 \n",
"\n",
" zasluzek oznaka \\\n",
"id \n",
"4972 10000000.0 TV-PG \n",
"6864 2180000.0 Passed \n",
"9968 NaN Not Rated \n",
"10323 NaN Not Rated \n",
"12349 5450000.0 Passed \n",
"12364 NaN Not Rated \n",
"13442 NaN Not Rated \n",
"14341 1172499.0 Passed \n",
"14429 1359903.0 Not Rated \n",
"15064 94812.0 Not Rated \n",
"\n",
" opis \n",
"id \n",
"4972 The Stoneman family finds its friendship with ... \n",
"6864 The story of a poor young woman separated by p... \n",
"9968 A frail waif, abused by her brutal boxer fathe... \n",
"10323 Hypnotist Dr. Caligari uses a somnambulist, Ce... \n",
"12349 The Tramp cares for an abandoned child, but ev... \n",
"12364 On New Year's Eve, the driver of a ghostly car... \n",
"13442 Vampire Count Orlok expresses interest in a ne... \n",
"14341 A man returns to his Appalachian homestead. On... \n",
"14429 A boy leaves his small country town and heads ... \n",
"15064 An aging doorman is forced to face the scorn o... "
]
},
"execution_count": 2,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"filmi.head(10)"
]
},
{
"cell_type": "code",
"execution_count": 3,
"id": "48aea215",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
" naslov dolzina leto \\\n",
"id \n",
"18568902 Kaun Pravin Tambe? 134 2022 \n",
"18689424 Batman v Superman: Dawn of Justice - Ultimate ... 182 2016 \n",
"18968540 Incantation 110 2022 \n",
"20850406 Sita Ramam 163 2022 \n",
"21279138 Maid in Malacañang 114 2022 \n",
"\n",
" ocena metascore glasovi zasluzek oznaka \\\n",
"id \n",
"18568902 8.4 NaN 10163 NaN NaN \n",
"18689424 7.1 NaN 57662 NaN R \n",
"18968540 6.2 NaN 12366 NaN TV-MA \n",
"20850406 8.5 NaN 38490 NaN NaN \n",
"21279138 3.9 NaN 15273 NaN NaN \n",
"\n",
" opis \n",
"id \n",
"18568902 An indian cricketer who shows persistence and ... \n",
"18689424 Batman is manipulated by Lex Luthor to fear Su... \n",
"18968540 Six years ago, Li Ronan was cursed after break... \n",
"20850406 An orphan soldier, Lieutenant Ram's life chang... \n",
"21279138 The Last Days of Ferdinand and Imelda Marcos t... "
]
},
"execution_count": 3,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"filmi.tail()"
]
},
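{
"cell_type": "markdown",
"id": "a7c0e101",
"metadata": {},
"source": [
"As a small self-contained illustration of `.head` and `.tail`, the sketch below builds a tiny table by hand. The name `toy` and all of its values are made up for this example and are not taken from `filmi.csv`.\n",
"\n",
"```python\n",
"import pandas as pd\n",
"\n",
"# hypothetical toy data, not taken from filmi.csv\n",
"toy = pd.DataFrame(\n",
"    {'naslov': ['A', 'B', 'C', 'D', 'E', 'F', 'G'],\n",
"     'ocena': [6.2, 7.7, 7.3, 8.0, 8.3, 8.0, 7.9]},\n",
"    index=pd.Index([11, 12, 13, 14, 15, 16, 17], name='id'),\n",
")\n",
"\n",
"first = toy.head(3)   # the first three rows\n",
"last = toy.tail()     # n defaults to 5, so the last five rows\n",
"print(first)\n",
"print(last)\n",
"```\n",
"\n",
"Both methods return a new table and leave `toy` itself unchanged."
]
},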
{
"cell_type": "markdown",
"id": "f3993aa4",
"metadata": {},
"source": [
"Slices select rows by position, with the same `start:stop:step` meaning as for Python lists."
]
},
{
"cell_type": "code",
"execution_count": 4,
"id": "3f8575e8",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
" naslov dolzina leto ocena metascore glasovi \\\n",
"id \n",
"10323 The Cabinet of Dr. Caligari 67 1920 8.0 NaN 64133 \n",
"12364 The Phantom Carriage 107 1921 8.0 NaN 12624 \n",
"14341 Our Hospitality 65 1923 7.8 NaN 11428 \n",
"15064 The Last Laugh 90 1924 8.0 NaN 14150 \n",
"\n",
" zasluzek oznaka opis \n",
"id \n",
"10323 NaN Not Rated Hypnotist Dr. Caligari uses a somnambulist, Ce... \n",
"12364 NaN Not Rated On New Year's Eve, the driver of a ghostly car... \n",
"14341 1172499.0 Passed A man returns to his Appalachian homestead. On... \n",
"15064 94812.0 Not Rated An aging doorman is forced to face the scorn o... "
]
},
"execution_count": 4,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"filmi[3:10:2]"
]
},
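{
"cell_type": "markdown",
"id": "a7c0e102",
"metadata": {},
"source": [
"The slice `filmi[3:10:2]` works by position, not by the values of `id`. A minimal sketch on made-up data (the name `toy` and its contents are assumptions for illustration) suggests that an integer slice picks the same rows as the explicit positional form `.iloc`.\n",
"\n",
"```python\n",
"import pandas as pd\n",
"\n",
"# hypothetical toy data with labels that differ from the positions\n",
"toy = pd.DataFrame(\n",
"    {'ocena': [6.2, 7.7, 7.3, 8.0, 8.3, 8.0]},\n",
"    index=pd.Index([100, 200, 300, 400, 500, 600], name='id'),\n",
")\n",
"\n",
"by_slice = toy[1:6:2]        # rows at positions 1, 3 and 5\n",
"by_iloc = toy.iloc[1:6:2]    # the same selection, written explicitly\n",
"print(by_slice)\n",
"```\n",
"\n",
"Writing `.iloc` makes the intent explicit, which helps when the index labels are themselves integers."
]
},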
{
"cell_type": "markdown",
"id": "671eb18a",
"metadata": {},
"source": [
"Indexing a table with a column name gives access to that column."
]
},
{
"cell_type": "code",
"execution_count": 5,
"id": "24345646",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"id\n",
"4972 6.2\n",
"6864 7.7\n",
"9968 7.3\n",
"10323 8.0\n",
"12349 8.3\n",
" ... \n",
"18568902 8.4\n",
"18689424 7.1\n",
"18968540 6.2\n",
"20850406 8.5\n",
"21279138 3.9\n",
"Name: ocena, Length: 9999, dtype: float64"
]
},
"execution_count": 5,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"filmi['ocena']"
]
},
{
"cell_type": "markdown",
"id": "1848e350",
"metadata": {},
"source": [
"Columns are accessed often, so a shorter attribute notation can be used as well."
]
},
{
"cell_type": "code",
"execution_count": 6,
"id": "981f233b",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"id\n",
"4972 6.2\n",
"6864 7.7\n",
"9968 7.3\n",
"10323 8.0\n",
"12349 8.3\n",
" ... \n",
"18568902 8.4\n",
"18689424 7.1\n",
"18968540 6.2\n",
"20850406 8.5\n",
"21279138 3.9\n",
"Name: ocena, Length: 9999, dtype: float64"
]
},
"execution_count": 6,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"filmi.ocena"
]
},
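{
"cell_type": "markdown",
"id": "a7c0e103",
"metadata": {},
"source": [
"The attribute shorthand only works when the column name is a valid Python identifier that does not clash with an existing attribute or method of the table. The sketch below uses a made-up table `toy` with a hypothetical column named `count`, which collides with the `DataFrame.count` method; bracket indexing still returns the column.\n",
"\n",
"```python\n",
"import pandas as pd\n",
"\n",
"# hypothetical toy data; the column name 'count' is chosen to cause a clash\n",
"toy = pd.DataFrame({'ocena': [6.2, 7.7], 'count': [10, 20]})\n",
"\n",
"same = toy.ocena.equals(toy['ocena'])   # both spellings give the same column\n",
"is_method = callable(toy.count)         # toy.count is the method, not the column\n",
"column = toy['count']                   # brackets return the column itself\n",
"print(same, is_method, list(column))\n",
"```\n",
"\n",
"For this reason bracket indexing is the safer choice in reusable code, while the attribute form is convenient for interactive work."
]
},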
{
"cell_type": "markdown",
"id": "e82b589a",
"metadata": {},
"source": [
"To select several columns, pass a list of their labels as the index."
]
},
{
"cell_type": "code",
"execution_count": 7,
"id": "bdd04c83",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
" naslov ocena\n",
"id \n",
"4972 The Birth of a Nation 6.2\n",
"6864 Intolerance 7.7\n",
"9968 Broken Blossoms 7.3\n",
"10323 The Cabinet of Dr. Caligari 8.0\n",
"12349 The Kid 8.3\n",
"... ... ...\n",
"18568902 Kaun Pravin Tambe? 8.4\n",
"18689424 Batman v Superman: Dawn of Justice - Ultimate ... 7.1\n",
"18968540 Incantation 6.2\n",
"20850406 Sita Ramam 8.5\n",
"21279138 Maid in Malacañang 3.9\n",
"\n",
"[9999 rows x 2 columns]"
]
},
"execution_count": 7,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"filmi[['naslov', 'ocena']]"
]
},
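{
"cell_type": "markdown",
"id": "a7c0e104",
"metadata": {},
"source": [
"A single label returns a one-dimensional `Series`, while a list of labels returns a `DataFrame`, even when the list has only one element. A minimal sketch on made-up data (`toy` is a hypothetical name used only here):\n",
"\n",
"```python\n",
"import pandas as pd\n",
"\n",
"# hypothetical toy data, not taken from filmi.csv\n",
"toy = pd.DataFrame({'naslov': ['A', 'B'], 'leto': [1994, 2022], 'ocena': [9.3, 9.4]})\n",
"\n",
"one = toy['ocena']                # a Series\n",
"one_as_table = toy[['ocena']]     # a DataFrame with a single column\n",
"two = toy[['naslov', 'ocena']]    # a DataFrame, columns in the requested order\n",
"print(type(one).__name__, type(one_as_table).__name__, list(two.columns))\n",
"```"
]
},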
{
"cell_type": "markdown",
"id": "5c53107f",
"metadata": {},
"source": [
"The row at integer position `i` is accessed with `.iloc[i]`, and the row with index label `k` with `.loc[k]`."
]
},
{
"cell_type": "code",
"execution_count": 8,
"id": "32d7daeb",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"naslov Rebecca\n",
"dolzina 130\n",
"leto 1940\n",
"ocena 8.1\n",
"metascore 86.0\n",
"glasovi 137358\n",
"zasluzek 4360000.0\n",
"oznaka Approved\n",
"opis A self-conscious woman juggles adjusting to he...\n",
"Name: 32976, dtype: object"
]
},
"execution_count": 8,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"filmi.iloc[120]"
]
},
{
"cell_type": "code",
"execution_count": 9,
"id": "9efdc106",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"naslov Indiana Jones and the Last Crusade\n",
"dolzina 127\n",
"leto 1989\n",
"ocena 8.2\n",
"metascore 65.0\n",
"glasovi 750654\n",
"zasluzek 197171806.0\n",
"oznaka PG-13\n",
"opis In 1938, after his father Professor Henry Jone...\n",
"Name: 97576, dtype: object"
]
},
"execution_count": 9,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"filmi.loc[97576]"
]
},
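{
"cell_type": "markdown",
"id": "a7c0e105",
"metadata": {},
"source": [
"The difference between positions and labels is easiest to see on a small table whose index does not start at 0. The sketch below is only an illustration: the name `toy`, the titles and the ratings are assumptions, not data from `filmi.csv`.\n",
"\n",
"```python\n",
"import pandas as pd\n",
"\n",
"# hypothetical toy data with non-consecutive index labels\n",
"toy = pd.DataFrame(\n",
"    {'naslov': ['A', 'B', 'C'], 'ocena': [8.1, 8.2, 9.3]},\n",
"    index=pd.Index([32976, 97576, 111161], name='id'),\n",
")\n",
"\n",
"by_position = toy.iloc[1]            # the second row, whatever its label is\n",
"by_label = toy.loc[97576]            # the row whose index label is 97576\n",
"rating = toy.loc[111161, 'ocena']    # a single cell: row label, then column label\n",
"print(by_position['naslov'], by_label['naslov'], rating)\n",
"```\n",
"\n",
"Here `toy.iloc[1]` and `toy.loc[97576]` refer to the same row, because the label 97576 happens to sit at position 1."
]
},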
{
"cell_type": "markdown",
"id": "482938d9",
"metadata": {},
"source": [
"## Filtering\n",
"\n",
"To select particular rows of a table, index it with a column of Boolean values obtained from ordinary comparisons. The returned table keeps exactly those rows for which the column holds `True`."
]
},
{
"cell_type": "code",
"execution_count": 10,
"id": "4edb5c50",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"id\n",
"4972 False\n",
"6864 False\n",
"9968 False\n",
"10323 True\n",
"12349 True\n",
" ... \n",
"18568902 True\n",
"18689424 False\n",
"18968540 False\n",
"20850406 True\n",
"21279138 False\n",
"Name: ocena, Length: 9999, dtype: bool"
]
},
"execution_count": 10,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"filmi.ocena >= 8"
]
},
{
"cell_type": "code",
"execution_count": 11,
"id": "87e652a1",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
" naslov dolzina leto ocena metascore glasovi \\\n",
"id \n",
"111161 The Shawshank Redemption 142 1994 9.3 81.0 2651625 \n",
"15327088 Kantara 148 2022 9.4 NaN 33294 \n",
"\n",
" zasluzek oznaka opis \n",
"id \n",
"111161 28341469.0 R Two imprisoned men bond over a number of years... \n",
"15327088 NaN NaN It involves culture of Kambla and Bhootha Kola... "
]
},
"execution_count": 11,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"filmi[filmi.ocena >= 9.3]"
]
},
{
"cell_type": "code",
"execution_count": 12,
"id": "26aac5a2",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
" naslov dolzina leto \\\n",
"id \n",
"52077 Plan 9 from Outer Space 79 1957 \n",
"54673 The Beast of Yucca Flats 54 1961 \n",
"58548 Santa Claus Conquers the Martians 81 1964 \n",
"59464 Monster a Go-Go 68 1965 \n",
"60666 Manos: The Hands of Fate 70 1966 \n",
"... ... ... ... \n",
"15654262 Chup 135 2022 \n",
"16492678 Demon Slayer: Kimetsu no Yaiba - Tsuzumi Mansi... 87 2021 \n",
"18568902 Kaun Pravin Tambe? 134 2022 \n",
"20850406 Sita Ramam 163 2022 \n",
"21279138 Maid in Malacañang 114 2022 \n",
"\n",
" ocena metascore glasovi zasluzek oznaka \\\n",
"id \n",
"52077 3.9 56.0 38744 NaN Not Rated \n",
"54673 1.8 NaN 11242 NaN Unrated \n",
"58548 2.6 NaN 11838 NaN Not Rated \n",
"59464 1.7 NaN 11138 NaN TV-PG \n",
"60666 1.6 NaN 36445 NaN Not Rated \n",
"... ... ... ... ... ... \n",
"15654262 8.4 NaN 13098 NaN NaN \n",
"16492678 9.0 NaN 12634 NaN NaN \n",
"18568902 8.4 NaN 10163 NaN NaN \n",
"20850406 8.5 NaN 38490 NaN NaN \n",
"21279138 3.9 NaN 15273 NaN NaN \n",
"\n",
" opis \n",
"id \n",
"52077 Evil aliens attack Earth and set their terribl... \n",
"54673 A defecting Soviet scientist is hit by a nucle... \n",
"58548 The Martians kidnap Santa Claus because there ... \n",
"59464 A space capsule crash-lands on Earth, and the ... \n",
"60666 A family gets lost on the road and stumbles up... \n",
"... ... \n",
"15654262 A psychopath killer, targeting film critics. T... \n",
"16492678 Tanjiro ventures to the south-southeast where ... \n",
"18568902 An indian cricketer who shows persistence and ... \n",
"20850406 An orphan soldier, Lieutenant Ram's life chang... \n",
"21279138 The Last Days of Ferdinand and Imelda Marcos t... \n",
"\n",
"[695 rows x 9 columns]"
]
},
"execution_count": 12,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"filmi[((filmi.leto > 2010) & (filmi.ocena > 8)) | (filmi.ocena < 5)]"
]
},
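{
"cell_type": "markdown",
"id": "a7c0e106",
"metadata": {},
"source": [
"Conditions are combined with `&` (and), `|` (or) and `~` (not) rather than with the Python keywords `and`, `or` and `not`. These operators bind more tightly than comparisons, so each comparison needs its own parentheses, and `&` is evaluated before `|`. The sketch below checks this on made-up data; `toy` and the intermediate names are hypothetical.\n",
"\n",
"```python\n",
"import pandas as pd\n",
"\n",
"# hypothetical toy data, not taken from filmi.csv\n",
"toy = pd.DataFrame({\n",
"    'leto': [1994, 2015, 2020, 2005],\n",
"    'ocena': [9.3, 8.5, 4.0, 3.1],\n",
"})\n",
"\n",
"recent_and_good = (toy.leto > 2010) & (toy.ocena > 8)\n",
"bad = toy.ocena < 5\n",
"mask = recent_and_good | bad    # the grouping that A & B | C gets by default\n",
"not_bad = toy[~bad]             # ~ negates a Boolean column\n",
"print(toy[mask])\n",
"```\n",
"\n",
"Naming the intermediate conditions keeps longer filters readable and makes the grouping explicit."
]
},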
{
"cell_type": "markdown",
"id": "c9acb1ca",
"metadata": {},
"source": [
"### Exercise\n",
"\n",
"Find the films that we want to avoid at all costs: those that are longer than two hours, are rated below 4, and have more than 50,000 votes."
]
},
{
"cell_type": "code",
"execution_count": 13,
"id": "581eda77",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
" naslov dolzina leto ocena metascore glasovi \\\n",
"id \n",
"118688 Batman & Robin 125 1997 3.7 28.0 253972 \n",
"120179 Speed 2: Cruise Control 121 1997 3.9 23.0 81714 \n",
"2574698 Gunday 152 2014 2.6 NaN 59270 \n",
"7886848 Sadak 2 133 2020 1.1 NaN 95865 \n",
"10350922 Laxmii 141 2020 2.6 NaN 57411 \n",
"10888594 Radhe 135 2021 1.9 NaN 177814 \n",
"\n",
" zasluzek oznaka \\\n",
"id \n",
"118688 107325195.0 PG-13 \n",
"120179 48608066.0 PG-13 \n",
"2574698 NaN Not Rated \n",
"7886848 NaN TV-MA \n",
"10350922 NaN TV-MA \n",
"10888594 NaN TV-MA \n",
"\n",
" opis \n",
"id \n",
"118688 Batman and Robin try to keep their relationshi... \n",
"120179 A computer hacker breaks into the computer sys... \n",
"2574698 The lives of Calcutta's most powerful Gunday, ... \n",
"7886848 The film picks up where Sadak left off, revolv... \n",
"10350922 Aasif visits his wife's parents' house and hap... \n",
"10888594 After taking the dreaded gangster Gani Bhai, A... "
]
},
"execution_count": 13,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"filmi[(filmi.dolzina > 120) & (filmi.ocena < 4) & (filmi.glasovi > 50000)]"
]
},
{
"cell_type": "markdown",
"id": "c7bf000b",
"metadata": {},
"source": [
"## Sorting\n",
"\n",
"A table is sorted with the method `.sort_values`, which takes the name, or a list of names, of the columns to sort by. Optionally, the `ascending` argument specifies which columns are sorted in ascending and which in descending order."
]
},
{
"cell_type": "code",
"execution_count": 14,
"id": "0bab9d68",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
" naslov dolzina leto ocena metascore \\\n",
"id \n",
"2061702 To the Forest of Firefly Lights 45 2011 7.8 NaN \n",
"15324 Sherlock Jr. 45 1924 8.2 NaN \n",
"2591814 The Garden of Words 46 2013 7.4 NaN \n",
"275230 Blood: The Last Vampire 48 2000 6.6 44.0 \n",
"142236 Dragon Ball Z: Revival Fusion 51 1995 7.6 NaN \n",
"... ... ... ... ... ... \n",
"107007 Gettysburg 271 1993 7.6 NaN \n",
"74084 1900 317 1976 7.7 70.0 \n",
"1954470 Gangs of Wasseypur 321 2012 8.2 89.0 \n",
"346336 The Best of Youth 366 2003 8.5 89.0 \n",
"111341 Satantango 439 1994 8.3 NaN \n",
"\n",
" glasovi zasluzek oznaka \\\n",
"id \n",
"2061702 18535 NaN NaN \n",
"15324 50180 977375.0 Passed \n",
"2591814 44624 NaN TV-14 \n",
"275230 12761 NaN Not Rated \n",
"142236 11050 NaN PG \n",
"... ... ... ... \n",
"107007 29479 10769960.0 PG \n",
"74084 25679 NaN Unrated \n",
"1954470 96141 NaN Not Rated \n",
"346336 22119 274024.0 R \n",
"111341 11214 NaN Not Rated \n",
"\n",
" opis \n",
"id \n",
"2061702 Hotaru is rescued by a spirit when she gets lo... \n",
"15324 A film projectionist longs to be a detective, ... \n",
"2591814 A 15-year-old boy and 27-year-old woman find a... \n",
"275230 Saya is a Japanese vampire slayer whose next m... \n",
"142236 The universe is thrown into dimensional chaos ... \n",
"... ... \n",
"107007 In 1863, the Northern and Southern forces figh... \n",
"74084 The epic tale of a class struggle in twentieth... \n",
"1954470 A clash between Sultan and Shahid Khan leads t... \n",
"346336 An Italian epic that follows the lives of two ... \n",
"111341 On the eve of a large payment, residents of a ... \n",
"\n",
"[9999 rows x 9 columns]"
]
},
"execution_count": 14,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"filmi.sort_values('dolzina')"
]
},
{
"cell_type": "code",
"execution_count": 15,
"id": "573492fc",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
" naslov dolzina leto ocena metascore glasovi \\\n",
"id \n",
"15327088 Kantara 148 2022 9.4 NaN 33294 \n",
"111161 The Shawshank Redemption 142 1994 9.3 81.0 2651625 \n",
"68646 The Godfather 175 1972 9.2 100.0 1838099 \n",
"252487 The Chaos Class 87 1975 9.2 NaN 40747 \n",
"50083 12 Angry Men 96 1957 9.0 96.0 782923 \n",
"... ... ... ... ... ... ... \n",
"421051 Daniel the Wizard 81 2004 1.2 NaN 14413 \n",
"6038600 Smolensk 120 2016 1.2 NaN 39704 \n",
"7886848 Sadak 2 133 2020 1.1 NaN 95865 \n",
"5988370 Reis 108 2017 1.0 NaN 73382 \n",
"7221896 Cumali Ceber 100 2017 1.0 NaN 38958 \n",
"\n",
" zasluzek oznaka \\\n",
"id \n",
"15327088 NaN NaN \n",
"111161 28341469.0 R \n",
"68646 134966411.0 R \n",
"252487 NaN NaN \n",
"50083 4360000.0 Approved \n",
"... ... ... \n",
"421051 NaN Not Rated \n",
"6038600 NaN NaN \n",
"7886848 NaN TV-MA \n",
"5988370 NaN NaN \n",
"7221896 NaN NaN \n",
"\n",
" opis \n",
"id \n",
"15327088 It involves culture of Kambla and Bhootha Kola... \n",
"111161 Two imprisoned men bond over a number of years... \n",
"68646 The aging patriarch of an organized crime dyna... \n",
"252487 Lazy, uneducated students share a very close b... \n",
"50083 The jury in a New York City murder trial is fr... \n",
"... ... \n",
"421051 Evil assassins want to kill Daniel Kublbock, t... \n",
"6038600 An inspired story of people affected by the tr... \n",
"7886848 The film picks up where Sadak left off, revolv... \n",
"5988370 A drama about the early life of Recep Tayyip E... \n",
"7221896 Cumali Ceber goes to a vacation with his child... \n",
"\n",
"[9999 rows x 9 columns]"
]
},
"execution_count": 15,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# sort by rating in descending order, then by year in ascending order within each rating\n",
"filmi.sort_values(['ocena', 'leto'], ascending=[False, True])"
]
},
{
"cell_type": "markdown",
"id": "ee831609",
"metadata": {},
"source": [
"## Grouping\n",
"\n",
"The `.groupby` method creates a special kind of table (a `DataFrameGroupBy` object) in which the rows are grouped by a shared property."
]
},
{
"cell_type": "code",
"execution_count": 16,
"id": "9f1340d9",
"metadata": {},
"outputs": [],
"source": [
"filmi_po_letih = filmi.groupby('leto')"
]
},
{
"cell_type": "code",
"execution_count": 17,
"id": "b3dfc27a",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
""
]
},
"execution_count": 17,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"filmi_po_letih"
]
},
{
"cell_type": "code",
"execution_count": 18,
"id": "96848f15",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"leto\n",
"1915 6.200000\n",
"1916 7.700000\n",
"1919 7.300000\n",
"1920 8.000000\n",
"1921 8.150000\n",
" ... \n",
"2018 6.430748\n",
"2019 6.493051\n",
"2020 6.144304\n",
"2021 6.369742\n",
"2022 6.361628\n",
"Name: ocena, Length: 106, dtype: float64"
]
},
"execution_count": 18,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# average rating for each year\n",
"filmi_po_letih.ocena.mean()"
]
},
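{
"cell_type": "markdown",
"id": "7c2e9a41",
"metadata": {},
"source": [
"A single column of the grouped table can also be summarised with several statistics at once. The cell below is a minimal sketch, assuming the `filmi_po_letih` object defined above. Passing a list of function names to `.agg` is one way to do this; the choice of `'mean'`, `'min'` and `'max'` is purely illustrative."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "d05b3f68",
"metadata": {},
"outputs": [],
"source": [
"# several summary statistics of the rating per year at once (illustrative choice of statistics)\n",
"filmi_po_letih.ocena.agg(['mean', 'min', 'max'])"
]
},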
{
"cell_type": "markdown",
"id": "980188d2",
"metadata": {},
"source": [
"We can also group by computed properties. Let us compute a new column and store it in the table."
]
},
{
"cell_type": "code",
"execution_count": 19,
"id": "3db1909f",
"metadata": {},
"outputs": [],
"source": [
"filmi['desetletje'] = 10 * (filmi.leto // 10)"
]
},
{
"cell_type": "code",
"execution_count": 20,
"id": "1d1d4ed8",
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
"<div>\n",
"<table border=\"1\" class=\"dataframe\">\n",
"  <thead>\n",
"    <tr style=\"text-align: right;\">\n",
"      <th></th>\n",
"      <th>naslov</th>\n",
"      <th>dolzina</th>\n",
"      <th>leto</th>\n",
"      <th>ocena</th>\n",
"      <th>metascore</th>\n",
"      <th>glasovi</th>\n",
"      <th>zasluzek</th>\n",
"      <th>oznaka</th>\n",
"      <th>opis</th>\n",
"      <th>desetletje</th>\n",
"    </tr>\n",
"    <tr>\n",
"      <th>id</th>\n",
"      <th></th>\n",
"      <th></th>\n",
"      <th></th>\n",
"      <th></th>\n",
"      <th></th>\n",
"      <th></th>\n",
"      <th></th>\n",
"      <th></th>\n",
"      <th></th>\n",
"      <th></th>\n",
"    </tr>\n",
"  </thead>\n",
"  <tbody>\n",
"    <tr>\n",
"      <th>4972</th>\n",
"      <td>The Birth of a Nation</td>\n",
"      <td>195</td>\n",
"      <td>1915</td>\n",
"      <td>6.2</td>\n",
"      <td>NaN</td>\n",
"      <td>24890</td>\n",
"      <td>10000000.0</td>\n",
"      <td>TV-PG</td>\n",
"      <td>The Stoneman family finds its friendship with ...</td>\n",
"      <td>1910</td>\n",
"    </tr>\n",
"    <tr>\n",
"      <th>6864</th>\n",
"      <td>Intolerance</td>\n",
"      <td>197</td>\n",
"      <td>1916</td>\n",
"      <td>7.7</td>\n",
"      <td>99.0</td>\n",
"      <td>15670</td>\n",
"      <td>2180000.0</td>\n",
"      <td>Passed</td>\n",
"      <td>The story of a poor young woman separated by p...</td>\n",
"      <td>1910</td>\n",
"    </tr>\n",
"    <tr>\n",
"      <th>9968</th>\n",
"      <td>Broken Blossoms</td>\n",
"      <td>90</td>\n",
"      <td>1919</td>\n",
"      <td>7.3</td>\n",
"      <td>NaN</td>\n",
"      <td>10423</td>\n",
"      <td>NaN</td>\n",
"      <td>Not Rated</td>\n",
"      <td>A frail waif, abused by her brutal boxer fathe...</td>\n",
"      <td>1910</td>\n",
"    </tr>\n",
"    <tr>\n",
"      <th>10323</th>\n",
"      <td>The Cabinet of Dr. Caligari</td>\n",
"      <td>67</td>\n",
"      <td>1920</td>\n",
"      <td>8.0</td>\n",
"      <td>NaN</td>\n",
"      <td>64133</td>\n",
"      <td>NaN</td>\n",
"      <td>Not Rated</td>\n",
"      <td>Hypnotist Dr. Caligari uses a somnambulist, Ce...</td>\n",
"      <td>1920</td>\n",
"    </tr>\n",
"    <tr>\n",
"      <th>12349</th>\n",
"      <td>The Kid</td>\n",
"      <td>68</td>\n",
"      <td>1921</td>\n",
"      <td>8.3</td>\n",
"      <td>NaN</td>\n",
"      <td>126513</td>\n",
"      <td>5450000.0</td>\n",
"      <td>Passed</td>\n",
"      <td>The Tramp cares for an abandoned child, but ev...</td>\n",
"      <td>1920</td>\n",
"    </tr>\n",
"    <tr>\n",
"      <th>...</th>\n",
"      <td>...</td>\n",
"      <td>...</td>\n",
"      <td>...</td>\n",
"      <td>...</td>\n",
"      <td>...</td>\n",
"      <td>...</td>\n",
"      <td>...</td>\n",
"      <td>...</td>\n",
"      <td>...</td>\n",
"      <td>...</td>\n",
"    </tr>\n",
"    <tr>\n",
"      <th>18568902</th>\n",
"      <td>Kaun Pravin Tambe?</td>\n",
"      <td>134</td>\n",
"      <td>2022</td>\n",
"      <td>8.4</td>\n",
"      <td>NaN</td>\n",
"      <td>10163</td>\n",
"      <td>NaN</td>\n",
"      <td>NaN</td>\n",
"      <td>An indian cricketer who shows persistence and ...</td>\n",
"      <td>2020</td>\n",
"    </tr>\n",
"    <tr>\n",
"      <th>18689424</th>\n",
"      <td>Batman v Superman: Dawn of Justice - Ultimate ...</td>\n",
"      <td>182</td>\n",
"      <td>2016</td>\n",
"      <td>7.1</td>\n",
"      <td>NaN</td>\n",
"      <td>57662</td>\n",
"      <td>NaN</td>\n",
"      <td>R</td>\n",
"      <td>Batman is manipulated by Lex Luthor to fear Su...</td>\n",
"      <td>2010</td>\n",
"    </tr>\n",
"    <tr>\n",
"      <th>18968540</th>\n",
"      <td>Incantation</td>\n",
"      <td>110</td>\n",
"      <td>2022</td>\n",
"      <td>6.2</td>\n",
"      <td>NaN</td>\n",
"      <td>12366</td>\n",
"      <td>NaN</td>\n",
"      <td>TV-MA</td>\n",
"      <td>Six years ago, Li Ronan was cursed after break...</td>\n",
"      <td>2020</td>\n",
"    </tr>\n",
"    <tr>\n",
"      <th>20850406</th>\n",
"      <td>Sita Ramam</td>\n",
"      <td>163</td>\n",
"      <td>2022</td>\n",
"      <td>8.5</td>\n",
"      <td>NaN</td>\n",
"      <td>38490</td>\n",
"      <td>NaN</td>\n",
"      <td>NaN</td>\n",
"      <td>An orphan soldier, Lieutenant Ram's life chang...</td>\n",
"      <td>2020</td>\n",
"    </tr>\n",
"    <tr>\n",
"      <th>21279138</th>\n",
"      <td>Maid in Malacañang</td>\n",
"      <td>114</td>\n",
"      <td>2022</td>\n",
"      <td>3.9</td>\n",
"      <td>NaN</td>\n",
"      <td>15273</td>\n",
"      <td>NaN</td>\n",
"      <td>NaN</td>\n",
"      <td>The Last Days of Ferdinand and Imelda Marcos t...</td>\n",
"      <td>2020</td>\n",
"    </tr>\n",
"  </tbody>\n",
"</table>\n",
"<p>9999 rows × 10 columns</p>\n",
"</div>"
],
"text/plain": [
" naslov dolzina leto \\\n",
"id \n",
"4972 The Birth of a Nation 195 1915 \n",
"6864 Intolerance 197 1916 \n",
"9968 Broken Blossoms 90 1919 \n",
"10323 The Cabinet of Dr. Caligari 67 1920 \n",
"12349 The Kid 68 1921 \n",
"... ... ... ... \n",
"18568902 Kaun Pravin Tambe? 134 2022 \n",
"18689424 Batman v Superman: Dawn of Justice - Ultimate ... 182 2016 \n",
"18968540 Incantation 110 2022 \n",
"20850406 Sita Ramam 163 2022 \n",
"21279138 Maid in Malacañang 114 2022 \n",
"\n",
" ocena metascore glasovi zasluzek oznaka \\\n",
"id \n",
"4972 6.2 NaN 24890 10000000.0 TV-PG \n",
"6864 7.7 99.0 15670 2180000.0 Passed \n",
"9968 7.3 NaN 10423 NaN Not Rated \n",
"10323 8.0 NaN 64133 NaN Not Rated \n",
"12349 8.3 NaN 126513 5450000.0 Passed \n",
"... ... ... ... ... ... \n",
"18568902 8.4 NaN 10163 NaN NaN \n",
"18689424 7.1 NaN 57662 NaN R \n",
"18968540 6.2 NaN 12366 NaN TV-MA \n",
"20850406 8.5 NaN 38490 NaN NaN \n",
"21279138 3.9 NaN 15273 NaN NaN \n",
"\n",
" opis desetletje \n",
"id \n",
"4972 The Stoneman family finds its friendship with ... 1910 \n",
"6864 The story of a poor young woman separated by p... 1910 \n",
"9968 A frail waif, abused by her brutal boxer fathe... 1910 \n",
"10323 Hypnotist Dr. Caligari uses a somnambulist, Ce... 1920 \n",
"12349 The Tramp cares for an abandoned child, but ev... 1920 \n",
"... ... ... \n",
"18568902 An indian cricketer who shows persistence and ... 2020 \n",
"18689424 Batman is manipulated by Lex Luthor to fear Su... 2010 \n",
"18968540 Six years ago, Li Ronan was cursed after break... 2020 \n",
"20850406 An orphan soldier, Lieutenant Ram's life chang... 2020 \n",
"21279138 The Last Days of Ferdinand and Imelda Marcos t... 2020 \n",
"\n",
"[9999 rows x 10 columns]"
]
},
"execution_count": 20,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"filmi"
]
},
{
"cell_type": "code",
"execution_count": 21,
"id": "4f0fe759",
"metadata": {},
"outputs": [],
"source": [
"filmi_po_desetletjih = filmi.groupby('desetletje')"
]
},
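{
"cell_type": "markdown",
"id": "3e8f1b57",
"metadata": {},
"source": [
"Storing the computed column first is a choice, not a requirement. As a sketch, and assuming only the `filmi` table loaded at the start, `.groupby` also accepts a computed `Series` that shares the table's index, so the decade can be computed on the fly. The average rating is used here only as an example aggregate."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "a94d0c26",
"metadata": {},
"outputs": [],
"source": [
"# group by a computed Series directly; it shares the index of filmi\n",
"filmi.groupby(10 * (filmi.leto // 10)).ocena.mean()"
]
},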
{
"cell_type": "markdown",
"id": "c0d39c61",
"metadata": {},
"source": [
"Count the films in each decade. Most columns show the same number, because `.count()` counts the non-missing entries in each column and most columns are complete. Where some data is missing (for example in `metascore` or `zasluzek`), the number is smaller."
]
},
{
"cell_type": "code",
"execution_count": 22,
"id": "46d8cd3b",
"metadata": {},
"outputs": [
{
"data": {
"text/html": [
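"<div>\n",
"<table border=\"1\" class=\"dataframe\">\n",
"  <thead>\n",
"    <tr style=\"text-align: right;\">\n",
"      <th></th>\n",
"      <th>naslov</th>\n",
"      <th>dolzina</th>\n",
"      <th>leto</th>\n",
"      <th>ocena</th>\n",
"      <th>metascore</th>\n",
"      <th>glasovi</th>\n",
"      <th>zasluzek</th>\n",
"      <th>oznaka</th>\n",
"      <th>opis</th>\n",
"    </tr>\n",
"    <tr>\n",
"      <th>desetletje</th>\n",
"      <th></th>\n",
"      <th></th>\n",
"      <th></th>\n",
"      <th></th>\n",
"      <th></th>\n",
"      <th></th>\n",
"      <th></th>\n",
"      <th></th>\n",
"      <th></th>\n",
"    </tr>\n",
"  </thead>\n",
"  <tbody>\n",
"    <tr>\n",
"      <th>1910</th>\n",
"      <td>3</td>\n",
"      <td>3</td>\n",
"      <td>3</td>\n",
"      <td>3</td>\n",
"      <td>1</td>\n",
"      <td>3</td>\n",
"      <td>2</td>\n",
"      <td>3</td>\n",
"      <td>3</td>\n",
"    </tr>\n",
"    <tr>\n",
"      <th>1920</th>\n",
"      <td>27</td>\n",
"      <td>27</td>\n",
"      <td>27</td>\n",
"      <td>27</td>\n",
"      <td>4</td>\n",
"      <td>27</td>\n",
"      <td>18</td>\n",
"      <td>27</td>\n",
"      <td>27</td>\n",
"    </tr>\n",
"    <tr>\n",
"      <th>1930</th>\n",
"      <td>80</td>\n",
"      <td>80</td>\n",
"      <td>80</td>\n",
"      <td>80</td>\n",
"      <td>39</td>\n",
"      <td>80</td>\n",
"      <td>36</td>\n",
"      <td>80</td>\n",
"      <td>80</td>\n",
"    </tr>\n",
"    <tr>\n",
"      <th>1940</th>\n",
"      <td>134</td>\n",
"      <td>134</td>\n",
"      <td>134</td>\n",
"      <td>134</td>\n",
"      <td>63</td>\n",
"      <td>134</td>\n",
"      <td>46</td>\n",
"      <td>133</td>\n",
"      <td>134</td>\n",
"    </tr>\n",
"    <tr>\n",
"      <th>1950</th>\n",
"      <td>205</td>\n",
"      <td>205</td>\n",
"      <td>205</td>\n",
"      <td>205</td>\n",
"      <td>113</td>\n",
"      <td>205</td>\n",
"      <td>92</td>\n",
"      <td>205</td>\n",
"      <td>205</td>\n",
"    </tr>\n",
"    <tr>\n",
"      <th>1960</th>\n",
"      <td>284</td>\n",
"      <td>284</td>\n",
"      <td>284</td>\n",
"      <td>284</td>\n",
"      <td>172</td>\n",
"      <td>284</td>\n",
"      <td>150</td>\n",
"      <td>281</td>\n",
"      <td>284</td>\n",
"    </tr>\n",
"    <tr>\n",
"      <th>1970</th>\n",
"      <td>410</td>\n",
"      <td>410</td>\n",
"      <td>410</td>\n",
"      <td>410</td>\n",
"      <td>323</td>\n",
"      <td>410</td>\n",
"      <td>276</td>\n",
"      <td>394</td>\n",
"      <td>410</td>\n",
"    </tr>\n",
"    <tr>\n",
"      <th>1980</th>\n",
"      <td>823</td>\n",
"      <td>823</td>\n",
"      <td>823</td>\n",
"      <td>823</td>\n",
"      <td>721</td>\n",
"      <td>823</td>\n",
"      <td>711</td>\n",
"      <td>809</td>\n",
"      <td>823</td>\n",
"    </tr>\n",
"    <tr>\n",
"      <th>1990</th>\n",
"      <td>1420</td>\n",
"      <td>1420</td>\n",
"      <td>1420</td>\n",
"      <td>1420</td>\n",
"      <td>1128</td>\n",
"      <td>1420</td>\n",
"      <td>1324</td>\n",
"      <td>1399</td>\n",
"      <td>1420</td>\n",
"    </tr>\n",
"    <tr>\n",
"      <th>2000</th>\n",
"      <td>2575</td>\n",
"      <td>2575</td>\n",
"      <td>2575</td>\n",
"      <td>2575</td>\n",
"      <td>2183</td>\n",
"      <td>2575</td>\n",
"      <td>2203</td>\n",
"      <td>2507</td>\n",
"      <td>2575</td>\n",
"    </tr>\n",
"    <tr>\n",
"      <th>2010</th>\n",
"      <td>3358</td>\n",
"      <td>3358</td>\n",
"      <td>3358</td>\n",
"      <td>3358</td>\n",
"      <td>2728</td>\n",
"      <td>3358</td>\n",
"      <td>2353</td>\n",
"      <td>3228</td>\n",
"      <td>3358</td>\n",
"    </tr>\n",
"    <tr>\n",
"      <th>2020</th>\n",
"      <td>680</td>\n",
"      <td>680</td>\n",
"      <td>680</td>\n",
"      <td>680</td>\n",
"      <td>500</td>\n",
"      <td>680</td>\n",
"      <td>47</td>\n",
"      <td>586</td>\n",
"      <td>680</td>\n",
"    </tr>\n",
"  </tbody>\n",
"</table>\n",
"</div>"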
],
"text/plain": [
" naslov dolzina leto ocena metascore glasovi zasluzek \\\n",
"desetletje \n",
"1910 3 3 3 3 1 3 2 \n",
"1920 27 27 27 27 4 27 18 \n",
"1930 80 80 80 80 39 80 36 \n",
"1940 134 134 134 134 63 134 46 \n",
"1950 205 205 205 205 113 205 92 \n",
"1960 284 284 284 284 172 284 150 \n",
"1970 410 410 410 410 323 410 276 \n",
"1980 823 823 823 823 721 823 711 \n",
"1990 1420 1420 1420 1420 1128 1420 1324 \n",
"2000 2575 2575 2575 2575 2183 2575 2203 \n",
"2010 3358 3358 3358 3358 2728 3358 2353 \n",
"2020 680 680 680 680 500 680 47 \n",
"\n",
" oznaka opis \n",
"desetletje \n",
"1910 3 3 \n",
"1920 27 27 \n",
"1930 80 80 \n",
"1940 133 134 \n",
"1950 205 205 \n",
"1960 281 284 \n",
"1970 394 410 \n",
"1980 809 823 \n",
"1990 1399 1420 \n",
"2000 2507 2575 \n",
"2010 3228 3358 \n",
"2020 586 680 "
]
},
"execution_count": 22,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"filmi_po_desetletjih.count()"
]
},
{
"cell_type": "markdown",
"id": "217e0fe0",
"metadata": {},
"source": [
"To get only the number of members in each group, use the `.size()` method. In this case we get just a column (a `Series`), not a table."
]
},
{
"cell_type": "code",
"execution_count": 23,
"id": "74307cc2",
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"desetletje\n",
"1910 3\n",
"1920 27\n",
"1930 80\n",
"1940 134\n",
"1950 205\n",
"1960 284\n",
"1970 410\n",
"1980 823\n",
"1990 1420\n",
"2000 2575\n",
"2010 3358\n",
"2020 680\n",
"dtype: int64"
]
},
"execution_count": 23,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"filmi_po_desetletjih.size()"
]
},
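{
"cell_type": "markdown",
"id": "5b17e6d2",
"metadata": {},
"source": [
"For a simple count per group, grouping is not strictly necessary. As a sketch, assuming the `desetletje` column created earlier, `.value_counts()` on that column counts how often each decade occurs, and `.sort_index()` orders the result by decade rather than by frequency. This is shown as an alternative, not as a replacement for `.size()`."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "f2c60a9e",
"metadata": {},
"outputs": [],
"source": [
"# count the films per decade directly from the column, ordered by decade\n",
"filmi.desetletje.value_counts().sort_index()"
]
},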
{
"cell_type": "markdown",
"id": "9ee12965",
"metadata": {},
"source": [
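"Next, look at the averages for each decade. We want the mean year, length, ratings and earnings. A mean title cannot be computed: older versions of pandas silently dropped such text columns, but since pandas 2.0 a bare `.mean()` on a group that contains text columns raises a `TypeError`, so pass `numeric_only=True` to average only the numeric columns."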
]
},
{
"cell_type": "code",
"execution_count": 24,
"id": "b5115c01",
"metadata": {},
"outputs": [
{
"ename": "TypeError",
"evalue": "agg function failed [how->mean,dtype->object]",
"output_type": "error",
"traceback": [
"\u001b[0;31m---------------------------------------------------------------------------\u001b[0m",
"\u001b[0;31mTypeError\u001b[0m Traceback (most recent call last)",
"File \u001b[0;32m/opt/hostedtoolcache/Python/3.10.13/x64/lib/python3.10/site-packages/pandas/core/groupby/groupby.py:1870\u001b[0m, in \u001b[0;36mGroupBy._agg_py_fallback\u001b[0;34m(self, how, values, ndim, alt)\u001b[0m\n\u001b[1;32m 1869\u001b[0m \u001b[38;5;28;01mtry\u001b[39;00m:\n\u001b[0;32m-> 1870\u001b[0m res_values \u001b[38;5;241m=\u001b[39m \u001b[38;5;28;43mself\u001b[39;49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43mgrouper\u001b[49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43magg_series\u001b[49m\u001b[43m(\u001b[49m\u001b[43mser\u001b[49m\u001b[43m,\u001b[49m\u001b[43m \u001b[49m\u001b[43malt\u001b[49m\u001b[43m,\u001b[49m\u001b[43m \u001b[49m\u001b[43mpreserve_dtype\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[38;5;28;43;01mTrue\u001b[39;49;00m\u001b[43m)\u001b[49m\n\u001b[1;32m 1871\u001b[0m \u001b[38;5;28;01mexcept\u001b[39;00m \u001b[38;5;167;01mException\u001b[39;00m \u001b[38;5;28;01mas\u001b[39;00m err:\n",
"File \u001b[0;32m/opt/hostedtoolcache/Python/3.10.13/x64/lib/python3.10/site-packages/pandas/core/groupby/ops.py:850\u001b[0m, in \u001b[0;36mBaseGrouper.agg_series\u001b[0;34m(self, obj, func, preserve_dtype)\u001b[0m\n\u001b[1;32m 848\u001b[0m preserve_dtype \u001b[38;5;241m=\u001b[39m \u001b[38;5;28;01mTrue\u001b[39;00m\n\u001b[0;32m--> 850\u001b[0m result \u001b[38;5;241m=\u001b[39m \u001b[38;5;28;43mself\u001b[39;49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43m_aggregate_series_pure_python\u001b[49m\u001b[43m(\u001b[49m\u001b[43mobj\u001b[49m\u001b[43m,\u001b[49m\u001b[43m \u001b[49m\u001b[43mfunc\u001b[49m\u001b[43m)\u001b[49m\n\u001b[1;32m 852\u001b[0m npvalues \u001b[38;5;241m=\u001b[39m lib\u001b[38;5;241m.\u001b[39mmaybe_convert_objects(result, try_float\u001b[38;5;241m=\u001b[39m\u001b[38;5;28;01mFalse\u001b[39;00m)\n",
"File \u001b[0;32m/opt/hostedtoolcache/Python/3.10.13/x64/lib/python3.10/site-packages/pandas/core/groupby/ops.py:871\u001b[0m, in \u001b[0;36mBaseGrouper._aggregate_series_pure_python\u001b[0;34m(self, obj, func)\u001b[0m\n\u001b[1;32m 870\u001b[0m \u001b[38;5;28;01mfor\u001b[39;00m i, group \u001b[38;5;129;01min\u001b[39;00m \u001b[38;5;28menumerate\u001b[39m(splitter):\n\u001b[0;32m--> 871\u001b[0m res \u001b[38;5;241m=\u001b[39m \u001b[43mfunc\u001b[49m\u001b[43m(\u001b[49m\u001b[43mgroup\u001b[49m\u001b[43m)\u001b[49m\n\u001b[1;32m 872\u001b[0m res \u001b[38;5;241m=\u001b[39m extract_result(res)\n",
"File \u001b[0;32m/opt/hostedtoolcache/Python/3.10.13/x64/lib/python3.10/site-packages/pandas/core/groupby/groupby.py:2376\u001b[0m, in \u001b[0;36mGroupBy.mean..\u001b[0;34m(x)\u001b[0m\n\u001b[1;32m 2373\u001b[0m \u001b[38;5;28;01melse\u001b[39;00m:\n\u001b[1;32m 2374\u001b[0m result \u001b[38;5;241m=\u001b[39m \u001b[38;5;28mself\u001b[39m\u001b[38;5;241m.\u001b[39m_cython_agg_general(\n\u001b[1;32m 2375\u001b[0m \u001b[38;5;124m\"\u001b[39m\u001b[38;5;124mmean\u001b[39m\u001b[38;5;124m\"\u001b[39m,\n\u001b[0;32m-> 2376\u001b[0m alt\u001b[38;5;241m=\u001b[39m\u001b[38;5;28;01mlambda\u001b[39;00m x: \u001b[43mSeries\u001b[49m\u001b[43m(\u001b[49m\u001b[43mx\u001b[49m\u001b[43m)\u001b[49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43mmean\u001b[49m\u001b[43m(\u001b[49m\u001b[43mnumeric_only\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[43mnumeric_only\u001b[49m\u001b[43m)\u001b[49m,\n\u001b[1;32m 2377\u001b[0m numeric_only\u001b[38;5;241m=\u001b[39mnumeric_only,\n\u001b[1;32m 2378\u001b[0m )\n\u001b[1;32m 2379\u001b[0m \u001b[38;5;28;01mreturn\u001b[39;00m result\u001b[38;5;241m.\u001b[39m__finalize__(\u001b[38;5;28mself\u001b[39m\u001b[38;5;241m.\u001b[39mobj, method\u001b[38;5;241m=\u001b[39m\u001b[38;5;124m\"\u001b[39m\u001b[38;5;124mgroupby\u001b[39m\u001b[38;5;124m\"\u001b[39m)\n",
"File \u001b[0;32m/opt/hostedtoolcache/Python/3.10.13/x64/lib/python3.10/site-packages/pandas/core/series.py:6226\u001b[0m, in \u001b[0;36mSeries.mean\u001b[0;34m(self, axis, skipna, numeric_only, **kwargs)\u001b[0m\n\u001b[1;32m 6218\u001b[0m \u001b[38;5;129m@doc\u001b[39m(make_doc(\u001b[38;5;124m\"\u001b[39m\u001b[38;5;124mmean\u001b[39m\u001b[38;5;124m\"\u001b[39m, ndim\u001b[38;5;241m=\u001b[39m\u001b[38;5;241m1\u001b[39m))\n\u001b[1;32m 6219\u001b[0m \u001b[38;5;28;01mdef\u001b[39;00m \u001b[38;5;21mmean\u001b[39m(\n\u001b[1;32m 6220\u001b[0m \u001b[38;5;28mself\u001b[39m,\n\u001b[0;32m (...)\u001b[0m\n\u001b[1;32m 6224\u001b[0m \u001b[38;5;241m*\u001b[39m\u001b[38;5;241m*\u001b[39mkwargs,\n\u001b[1;32m 6225\u001b[0m ):\n\u001b[0;32m-> 6226\u001b[0m \u001b[38;5;28;01mreturn\u001b[39;00m \u001b[43mNDFrame\u001b[49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43mmean\u001b[49m\u001b[43m(\u001b[49m\u001b[38;5;28;43mself\u001b[39;49m\u001b[43m,\u001b[49m\u001b[43m \u001b[49m\u001b[43maxis\u001b[49m\u001b[43m,\u001b[49m\u001b[43m \u001b[49m\u001b[43mskipna\u001b[49m\u001b[43m,\u001b[49m\u001b[43m \u001b[49m\u001b[43mnumeric_only\u001b[49m\u001b[43m,\u001b[49m\u001b[43m \u001b[49m\u001b[38;5;241;43m*\u001b[39;49m\u001b[38;5;241;43m*\u001b[39;49m\u001b[43mkwargs\u001b[49m\u001b[43m)\u001b[49m\n",
"File \u001b[0;32m/opt/hostedtoolcache/Python/3.10.13/x64/lib/python3.10/site-packages/pandas/core/generic.py:11969\u001b[0m, in \u001b[0;36mNDFrame.mean\u001b[0;34m(self, axis, skipna, numeric_only, **kwargs)\u001b[0m\n\u001b[1;32m 11962\u001b[0m \u001b[38;5;28;01mdef\u001b[39;00m \u001b[38;5;21mmean\u001b[39m(\n\u001b[1;32m 11963\u001b[0m \u001b[38;5;28mself\u001b[39m,\n\u001b[1;32m 11964\u001b[0m axis: Axis \u001b[38;5;241m|\u001b[39m \u001b[38;5;28;01mNone\u001b[39;00m \u001b[38;5;241m=\u001b[39m \u001b[38;5;241m0\u001b[39m,\n\u001b[0;32m (...)\u001b[0m\n\u001b[1;32m 11967\u001b[0m \u001b[38;5;241m*\u001b[39m\u001b[38;5;241m*\u001b[39mkwargs,\n\u001b[1;32m 11968\u001b[0m ) \u001b[38;5;241m-\u001b[39m\u001b[38;5;241m>\u001b[39m Series \u001b[38;5;241m|\u001b[39m \u001b[38;5;28mfloat\u001b[39m:\n\u001b[0;32m> 11969\u001b[0m \u001b[38;5;28;01mreturn\u001b[39;00m \u001b[38;5;28;43mself\u001b[39;49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43m_stat_function\u001b[49m\u001b[43m(\u001b[49m\n\u001b[1;32m 11970\u001b[0m \u001b[43m \u001b[49m\u001b[38;5;124;43m\"\u001b[39;49m\u001b[38;5;124;43mmean\u001b[39;49m\u001b[38;5;124;43m\"\u001b[39;49m\u001b[43m,\u001b[49m\u001b[43m \u001b[49m\u001b[43mnanops\u001b[49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43mnanmean\u001b[49m\u001b[43m,\u001b[49m\u001b[43m \u001b[49m\u001b[43maxis\u001b[49m\u001b[43m,\u001b[49m\u001b[43m \u001b[49m\u001b[43mskipna\u001b[49m\u001b[43m,\u001b[49m\u001b[43m \u001b[49m\u001b[43mnumeric_only\u001b[49m\u001b[43m,\u001b[49m\u001b[43m \u001b[49m\u001b[38;5;241;43m*\u001b[39;49m\u001b[38;5;241;43m*\u001b[39;49m\u001b[43mkwargs\u001b[49m\n\u001b[1;32m 11971\u001b[0m \u001b[43m \u001b[49m\u001b[43m)\u001b[49m\n",
"File \u001b[0;32m/opt/hostedtoolcache/Python/3.10.13/x64/lib/python3.10/site-packages/pandas/core/generic.py:11926\u001b[0m, in \u001b[0;36mNDFrame._stat_function\u001b[0;34m(self, name, func, axis, skipna, numeric_only, **kwargs)\u001b[0m\n\u001b[1;32m 11924\u001b[0m validate_bool_kwarg(skipna, \u001b[38;5;124m\"\u001b[39m\u001b[38;5;124mskipna\u001b[39m\u001b[38;5;124m\"\u001b[39m, none_allowed\u001b[38;5;241m=\u001b[39m\u001b[38;5;28;01mFalse\u001b[39;00m)\n\u001b[0;32m> 11926\u001b[0m \u001b[38;5;28;01mreturn\u001b[39;00m \u001b[38;5;28;43mself\u001b[39;49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43m_reduce\u001b[49m\u001b[43m(\u001b[49m\n\u001b[1;32m 11927\u001b[0m \u001b[43m \u001b[49m\u001b[43mfunc\u001b[49m\u001b[43m,\u001b[49m\u001b[43m \u001b[49m\u001b[43mname\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[43mname\u001b[49m\u001b[43m,\u001b[49m\u001b[43m \u001b[49m\u001b[43maxis\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[43maxis\u001b[49m\u001b[43m,\u001b[49m\u001b[43m \u001b[49m\u001b[43mskipna\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[43mskipna\u001b[49m\u001b[43m,\u001b[49m\u001b[43m \u001b[49m\u001b[43mnumeric_only\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[43mnumeric_only\u001b[49m\n\u001b[1;32m 11928\u001b[0m \u001b[43m\u001b[49m\u001b[43m)\u001b[49m\n",
"File \u001b[0;32m/opt/hostedtoolcache/Python/3.10.13/x64/lib/python3.10/site-packages/pandas/core/series.py:6134\u001b[0m, in \u001b[0;36mSeries._reduce\u001b[0;34m(self, op, name, axis, skipna, numeric_only, filter_type, **kwds)\u001b[0m\n\u001b[1;32m 6130\u001b[0m \u001b[38;5;28;01mraise\u001b[39;00m \u001b[38;5;167;01mTypeError\u001b[39;00m(\n\u001b[1;32m 6131\u001b[0m \u001b[38;5;124mf\u001b[39m\u001b[38;5;124m\"\u001b[39m\u001b[38;5;124mSeries.\u001b[39m\u001b[38;5;132;01m{\u001b[39;00mname\u001b[38;5;132;01m}\u001b[39;00m\u001b[38;5;124m does not allow \u001b[39m\u001b[38;5;132;01m{\u001b[39;00mkwd_name\u001b[38;5;132;01m}\u001b[39;00m\u001b[38;5;124m=\u001b[39m\u001b[38;5;132;01m{\u001b[39;00mnumeric_only\u001b[38;5;132;01m}\u001b[39;00m\u001b[38;5;124m \u001b[39m\u001b[38;5;124m\"\u001b[39m\n\u001b[1;32m 6132\u001b[0m \u001b[38;5;124m\"\u001b[39m\u001b[38;5;124mwith non-numeric dtypes.\u001b[39m\u001b[38;5;124m\"\u001b[39m\n\u001b[1;32m 6133\u001b[0m )\n\u001b[0;32m-> 6134\u001b[0m \u001b[38;5;28;01mreturn\u001b[39;00m \u001b[43mop\u001b[49m\u001b[43m(\u001b[49m\u001b[43mdelegate\u001b[49m\u001b[43m,\u001b[49m\u001b[43m \u001b[49m\u001b[43mskipna\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[43mskipna\u001b[49m\u001b[43m,\u001b[49m\u001b[43m \u001b[49m\u001b[38;5;241;43m*\u001b[39;49m\u001b[38;5;241;43m*\u001b[39;49m\u001b[43mkwds\u001b[49m\u001b[43m)\u001b[49m\n",
"File \u001b[0;32m/opt/hostedtoolcache/Python/3.10.13/x64/lib/python3.10/site-packages/pandas/core/nanops.py:147\u001b[0m, in \u001b[0;36mbottleneck_switch.__call__..f\u001b[0;34m(values, axis, skipna, **kwds)\u001b[0m\n\u001b[1;32m 146\u001b[0m \u001b[38;5;28;01melse\u001b[39;00m:\n\u001b[0;32m--> 147\u001b[0m result \u001b[38;5;241m=\u001b[39m \u001b[43malt\u001b[49m\u001b[43m(\u001b[49m\u001b[43mvalues\u001b[49m\u001b[43m,\u001b[49m\u001b[43m \u001b[49m\u001b[43maxis\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[43maxis\u001b[49m\u001b[43m,\u001b[49m\u001b[43m \u001b[49m\u001b[43mskipna\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[43mskipna\u001b[49m\u001b[43m,\u001b[49m\u001b[43m \u001b[49m\u001b[38;5;241;43m*\u001b[39;49m\u001b[38;5;241;43m*\u001b[39;49m\u001b[43mkwds\u001b[49m\u001b[43m)\u001b[49m\n\u001b[1;32m 149\u001b[0m \u001b[38;5;28;01mreturn\u001b[39;00m result\n",
"File \u001b[0;32m/opt/hostedtoolcache/Python/3.10.13/x64/lib/python3.10/site-packages/pandas/core/nanops.py:404\u001b[0m, in \u001b[0;36m_datetimelike_compat..new_func\u001b[0;34m(values, axis, skipna, mask, **kwargs)\u001b[0m\n\u001b[1;32m 402\u001b[0m mask \u001b[38;5;241m=\u001b[39m isna(values)\n\u001b[0;32m--> 404\u001b[0m result \u001b[38;5;241m=\u001b[39m \u001b[43mfunc\u001b[49m\u001b[43m(\u001b[49m\u001b[43mvalues\u001b[49m\u001b[43m,\u001b[49m\u001b[43m \u001b[49m\u001b[43maxis\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[43maxis\u001b[49m\u001b[43m,\u001b[49m\u001b[43m \u001b[49m\u001b[43mskipna\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[43mskipna\u001b[49m\u001b[43m,\u001b[49m\u001b[43m \u001b[49m\u001b[43mmask\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[43mmask\u001b[49m\u001b[43m,\u001b[49m\u001b[43m \u001b[49m\u001b[38;5;241;43m*\u001b[39;49m\u001b[38;5;241;43m*\u001b[39;49m\u001b[43mkwargs\u001b[49m\u001b[43m)\u001b[49m\n\u001b[1;32m 406\u001b[0m \u001b[38;5;28;01mif\u001b[39;00m datetimelike:\n",
"File \u001b[0;32m/opt/hostedtoolcache/Python/3.10.13/x64/lib/python3.10/site-packages/pandas/core/nanops.py:720\u001b[0m, in \u001b[0;36mnanmean\u001b[0;34m(values, axis, skipna, mask)\u001b[0m\n\u001b[1;32m 719\u001b[0m the_sum \u001b[38;5;241m=\u001b[39m values\u001b[38;5;241m.\u001b[39msum(axis, dtype\u001b[38;5;241m=\u001b[39mdtype_sum)\n\u001b[0;32m--> 720\u001b[0m the_sum \u001b[38;5;241m=\u001b[39m \u001b[43m_ensure_numeric\u001b[49m\u001b[43m(\u001b[49m\u001b[43mthe_sum\u001b[49m\u001b[43m)\u001b[49m\n\u001b[1;32m 722\u001b[0m \u001b[38;5;28;01mif\u001b[39;00m axis \u001b[38;5;129;01mis\u001b[39;00m \u001b[38;5;129;01mnot\u001b[39;00m \u001b[38;5;28;01mNone\u001b[39;00m \u001b[38;5;129;01mand\u001b[39;00m \u001b[38;5;28mgetattr\u001b[39m(the_sum, \u001b[38;5;124m\"\u001b[39m\u001b[38;5;124mndim\u001b[39m\u001b[38;5;124m\"\u001b[39m, \u001b[38;5;28;01mFalse\u001b[39;00m):\n",
"File \u001b[0;32m/opt/hostedtoolcache/Python/3.10.13/x64/lib/python3.10/site-packages/pandas/core/nanops.py:1693\u001b[0m, in \u001b[0;36m_ensure_numeric\u001b[0;34m(x)\u001b[0m\n\u001b[1;32m 1691\u001b[0m \u001b[38;5;28;01mif\u001b[39;00m \u001b[38;5;28misinstance\u001b[39m(x, \u001b[38;5;28mstr\u001b[39m):\n\u001b[1;32m 1692\u001b[0m \u001b[38;5;66;03m# GH#44008, GH#36703 avoid casting e.g. strings to numeric\u001b[39;00m\n\u001b[0;32m-> 1693\u001b[0m \u001b[38;5;28;01mraise\u001b[39;00m \u001b[38;5;167;01mTypeError\u001b[39;00m(\u001b[38;5;124mf\u001b[39m\u001b[38;5;124m\"\u001b[39m\u001b[38;5;124mCould not convert string \u001b[39m\u001b[38;5;124m'\u001b[39m\u001b[38;5;132;01m{\u001b[39;00mx\u001b[38;5;132;01m}\u001b[39;00m\u001b[38;5;124m'\u001b[39m\u001b[38;5;124m to numeric\u001b[39m\u001b[38;5;124m\"\u001b[39m)\n\u001b[1;32m 1694\u001b[0m \u001b[38;5;28;01mtry\u001b[39;00m:\n",
"\u001b[0;31mTypeError\u001b[0m: Could not convert string 'The Birth of a NationIntoleranceBroken Blossoms' to numeric",
"\nThe above exception was the direct cause of the following exception:\n",
"\u001b[0;31mTypeError\u001b[0m Traceback (most recent call last)",
"Cell \u001b[0;32mIn[24], line 1\u001b[0m\n\u001b[0;32m----> 1\u001b[0m \u001b[43mfilmi_po_desetletjih\u001b[49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43mmean\u001b[49m\u001b[43m(\u001b[49m\u001b[43m)\u001b[49m\n",
"File \u001b[0;32m/opt/hostedtoolcache/Python/3.10.13/x64/lib/python3.10/site-packages/pandas/core/groupby/groupby.py:2374\u001b[0m, in \u001b[0;36mGroupBy.mean\u001b[0;34m(self, numeric_only, engine, engine_kwargs)\u001b[0m\n\u001b[1;32m 2367\u001b[0m \u001b[38;5;28;01mreturn\u001b[39;00m \u001b[38;5;28mself\u001b[39m\u001b[38;5;241m.\u001b[39m_numba_agg_general(\n\u001b[1;32m 2368\u001b[0m grouped_mean,\n\u001b[1;32m 2369\u001b[0m executor\u001b[38;5;241m.\u001b[39mfloat_dtype_mapping,\n\u001b[1;32m 2370\u001b[0m engine_kwargs,\n\u001b[1;32m 2371\u001b[0m min_periods\u001b[38;5;241m=\u001b[39m\u001b[38;5;241m0\u001b[39m,\n\u001b[1;32m 2372\u001b[0m )\n\u001b[1;32m 2373\u001b[0m \u001b[38;5;28;01melse\u001b[39;00m:\n\u001b[0;32m-> 2374\u001b[0m result \u001b[38;5;241m=\u001b[39m \u001b[38;5;28;43mself\u001b[39;49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43m_cython_agg_general\u001b[49m\u001b[43m(\u001b[49m\n\u001b[1;32m 2375\u001b[0m \u001b[43m \u001b[49m\u001b[38;5;124;43m\"\u001b[39;49m\u001b[38;5;124;43mmean\u001b[39;49m\u001b[38;5;124;43m\"\u001b[39;49m\u001b[43m,\u001b[49m\n\u001b[1;32m 2376\u001b[0m \u001b[43m \u001b[49m\u001b[43malt\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[38;5;28;43;01mlambda\u001b[39;49;00m\u001b[43m \u001b[49m\u001b[43mx\u001b[49m\u001b[43m:\u001b[49m\u001b[43m \u001b[49m\u001b[43mSeries\u001b[49m\u001b[43m(\u001b[49m\u001b[43mx\u001b[49m\u001b[43m)\u001b[49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43mmean\u001b[49m\u001b[43m(\u001b[49m\u001b[43mnumeric_only\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[43mnumeric_only\u001b[49m\u001b[43m)\u001b[49m\u001b[43m,\u001b[49m\n\u001b[1;32m 2377\u001b[0m \u001b[43m \u001b[49m\u001b[43mnumeric_only\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[43mnumeric_only\u001b[49m\u001b[43m,\u001b[49m\n\u001b[1;32m 2378\u001b[0m \u001b[43m \u001b[49m\u001b[43m)\u001b[49m\n\u001b[1;32m 2379\u001b[0m \u001b[38;5;28;01mreturn\u001b[39;00m 
result\u001b[38;5;241m.\u001b[39m__finalize__(\u001b[38;5;28mself\u001b[39m\u001b[38;5;241m.\u001b[39mobj, method\u001b[38;5;241m=\u001b[39m\u001b[38;5;124m\"\u001b[39m\u001b[38;5;124mgroupby\u001b[39m\u001b[38;5;124m\"\u001b[39m)\n",
"File \u001b[0;32m/opt/hostedtoolcache/Python/3.10.13/x64/lib/python3.10/site-packages/pandas/core/groupby/groupby.py:1925\u001b[0m, in \u001b[0;36mGroupBy._cython_agg_general\u001b[0;34m(self, how, alt, numeric_only, min_count, **kwargs)\u001b[0m\n\u001b[1;32m 1922\u001b[0m result \u001b[38;5;241m=\u001b[39m \u001b[38;5;28mself\u001b[39m\u001b[38;5;241m.\u001b[39m_agg_py_fallback(how, values, ndim\u001b[38;5;241m=\u001b[39mdata\u001b[38;5;241m.\u001b[39mndim, alt\u001b[38;5;241m=\u001b[39malt)\n\u001b[1;32m 1923\u001b[0m \u001b[38;5;28;01mreturn\u001b[39;00m result\n\u001b[0;32m-> 1925\u001b[0m new_mgr \u001b[38;5;241m=\u001b[39m \u001b[43mdata\u001b[49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43mgrouped_reduce\u001b[49m\u001b[43m(\u001b[49m\u001b[43marray_func\u001b[49m\u001b[43m)\u001b[49m\n\u001b[1;32m 1926\u001b[0m res \u001b[38;5;241m=\u001b[39m \u001b[38;5;28mself\u001b[39m\u001b[38;5;241m.\u001b[39m_wrap_agged_manager(new_mgr)\n\u001b[1;32m 1927\u001b[0m out \u001b[38;5;241m=\u001b[39m \u001b[38;5;28mself\u001b[39m\u001b[38;5;241m.\u001b[39m_wrap_aggregated_output(res)\n",
"File \u001b[0;32m/opt/hostedtoolcache/Python/3.10.13/x64/lib/python3.10/site-packages/pandas/core/internals/managers.py:1428\u001b[0m, in \u001b[0;36mBlockManager.grouped_reduce\u001b[0;34m(self, func)\u001b[0m\n\u001b[1;32m 1424\u001b[0m \u001b[38;5;28;01mif\u001b[39;00m blk\u001b[38;5;241m.\u001b[39mis_object:\n\u001b[1;32m 1425\u001b[0m \u001b[38;5;66;03m# split on object-dtype blocks bc some columns may raise\u001b[39;00m\n\u001b[1;32m 1426\u001b[0m \u001b[38;5;66;03m# while others do not.\u001b[39;00m\n\u001b[1;32m 1427\u001b[0m \u001b[38;5;28;01mfor\u001b[39;00m sb \u001b[38;5;129;01min\u001b[39;00m blk\u001b[38;5;241m.\u001b[39m_split():\n\u001b[0;32m-> 1428\u001b[0m applied \u001b[38;5;241m=\u001b[39m \u001b[43msb\u001b[49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43mapply\u001b[49m\u001b[43m(\u001b[49m\u001b[43mfunc\u001b[49m\u001b[43m)\u001b[49m\n\u001b[1;32m 1429\u001b[0m result_blocks \u001b[38;5;241m=\u001b[39m extend_blocks(applied, result_blocks)\n\u001b[1;32m 1430\u001b[0m \u001b[38;5;28;01melse\u001b[39;00m:\n",
"File \u001b[0;32m/opt/hostedtoolcache/Python/3.10.13/x64/lib/python3.10/site-packages/pandas/core/internals/blocks.py:366\u001b[0m, in \u001b[0;36mBlock.apply\u001b[0;34m(self, func, **kwargs)\u001b[0m\n\u001b[1;32m 360\u001b[0m \u001b[38;5;129m@final\u001b[39m\n\u001b[1;32m 361\u001b[0m \u001b[38;5;28;01mdef\u001b[39;00m \u001b[38;5;21mapply\u001b[39m(\u001b[38;5;28mself\u001b[39m, func, \u001b[38;5;241m*\u001b[39m\u001b[38;5;241m*\u001b[39mkwargs) \u001b[38;5;241m-\u001b[39m\u001b[38;5;241m>\u001b[39m \u001b[38;5;28mlist\u001b[39m[Block]:\n\u001b[1;32m 362\u001b[0m \u001b[38;5;250m \u001b[39m\u001b[38;5;124;03m\"\"\"\u001b[39;00m\n\u001b[1;32m 363\u001b[0m \u001b[38;5;124;03m apply the function to my values; return a block if we are not\u001b[39;00m\n\u001b[1;32m 364\u001b[0m \u001b[38;5;124;03m one\u001b[39;00m\n\u001b[1;32m 365\u001b[0m \u001b[38;5;124;03m \"\"\"\u001b[39;00m\n\u001b[0;32m--> 366\u001b[0m result \u001b[38;5;241m=\u001b[39m \u001b[43mfunc\u001b[49m\u001b[43m(\u001b[49m\u001b[38;5;28;43mself\u001b[39;49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43mvalues\u001b[49m\u001b[43m,\u001b[49m\u001b[43m \u001b[49m\u001b[38;5;241;43m*\u001b[39;49m\u001b[38;5;241;43m*\u001b[39;49m\u001b[43mkwargs\u001b[49m\u001b[43m)\u001b[49m\n\u001b[1;32m 368\u001b[0m result \u001b[38;5;241m=\u001b[39m maybe_coerce_values(result)\n\u001b[1;32m 369\u001b[0m \u001b[38;5;28;01mreturn\u001b[39;00m \u001b[38;5;28mself\u001b[39m\u001b[38;5;241m.\u001b[39m_split_op_result(result)\n",
"File \u001b[0;32m/opt/hostedtoolcache/Python/3.10.13/x64/lib/python3.10/site-packages/pandas/core/groupby/groupby.py:1922\u001b[0m, in \u001b[0;36mGroupBy._cython_agg_general..array_func\u001b[0;34m(values)\u001b[0m\n\u001b[1;32m 1919\u001b[0m \u001b[38;5;28;01melse\u001b[39;00m:\n\u001b[1;32m 1920\u001b[0m \u001b[38;5;28;01mreturn\u001b[39;00m result\n\u001b[0;32m-> 1922\u001b[0m result \u001b[38;5;241m=\u001b[39m \u001b[38;5;28;43mself\u001b[39;49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43m_agg_py_fallback\u001b[49m\u001b[43m(\u001b[49m\u001b[43mhow\u001b[49m\u001b[43m,\u001b[49m\u001b[43m \u001b[49m\u001b[43mvalues\u001b[49m\u001b[43m,\u001b[49m\u001b[43m \u001b[49m\u001b[43mndim\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[43mdata\u001b[49m\u001b[38;5;241;43m.\u001b[39;49m\u001b[43mndim\u001b[49m\u001b[43m,\u001b[49m\u001b[43m \u001b[49m\u001b[43malt\u001b[49m\u001b[38;5;241;43m=\u001b[39;49m\u001b[43malt\u001b[49m\u001b[43m)\u001b[49m\n\u001b[1;32m 1923\u001b[0m \u001b[38;5;28;01mreturn\u001b[39;00m result\n",
"File \u001b[0;32m/opt/hostedtoolcache/Python/3.10.13/x64/lib/python3.10/site-packages/pandas/core/groupby/groupby.py:1874\u001b[0m, in \u001b[0;36mGroupBy._agg_py_fallback\u001b[0;34m(self, how, values, ndim, alt)\u001b[0m\n\u001b[1;32m 1872\u001b[0m msg \u001b[38;5;241m=\u001b[39m \u001b[38;5;124mf\u001b[39m\u001b[38;5;124m\"\u001b[39m\u001b[38;5;124magg function failed [how->\u001b[39m\u001b[38;5;132;01m{\u001b[39;00mhow\u001b[38;5;132;01m}\u001b[39;00m\u001b[38;5;124m,dtype->\u001b[39m\u001b[38;5;132;01m{\u001b[39;00mser\u001b[38;5;241m.\u001b[39mdtype\u001b[38;5;132;01m}\u001b[39;00m\u001b[38;5;124m]\u001b[39m\u001b[38;5;124m\"\u001b[39m\n\u001b[1;32m 1873\u001b[0m \u001b[38;5;66;03m# preserve the kind of exception that raised\u001b[39;00m\n\u001b[0;32m-> 1874\u001b[0m \u001b[38;5;28;01mraise\u001b[39;00m \u001b[38;5;28mtype\u001b[39m(err)(msg) \u001b[38;5;28;01mfrom\u001b[39;00m \u001b[38;5;21;01merr\u001b[39;00m\n\u001b[1;32m 1876\u001b[0m \u001b[38;5;28;01mif\u001b[39;00m ser\u001b[38;5;241m.\u001b[39mdtype \u001b[38;5;241m==\u001b[39m \u001b[38;5;28mobject\u001b[39m:\n\u001b[1;32m 1877\u001b[0m res_values \u001b[38;5;241m=\u001b[39m res_values\u001b[38;5;241m.\u001b[39mastype(\u001b[38;5;28mobject\u001b[39m, copy\u001b[38;5;241m=\u001b[39m\u001b[38;5;28;01mFalse\u001b[39;00m)\n",
"\u001b[0;31mTypeError\u001b[0m: agg function failed [how->mean,dtype->object]"
]
}
],
"source": [
"filmi_po_desetletjih.mean()"
]
},
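{
"cell_type": "markdown",
"id": "5b7e2c01",
"metadata": {},
"source": [
"The `TypeError` in the traceback above is what recent pandas versions raise when a grouped `mean` meets text (object-dtype) columns such as titles. Below is a minimal sketch on a made-up table (the rows and the name `primer` are illustrative assumptions, not data from `filmi.csv`): passing `numeric_only=True` restricts the average to the numeric columns."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "5b7e2c02",
"metadata": {},
"outputs": [],
"source": [
"import pandas as pd\n",
"\n",
"# A tiny made-up table with one text column and one numeric column\n",
"primer = pd.DataFrame({\n",
"    'naslov': ['A', 'B', 'C', 'D'],\n",
"    'desetletje': [1990, 1990, 2000, 2000],\n",
"    'ocena': [7.0, 8.0, 6.0, 9.0],\n",
"})\n",
"\n",
"# numeric_only=True skips the text column 'naslov' instead of failing on it\n",
"povprecja = primer.groupby('desetletje').mean(numeric_only=True)\n",
"povprecja"
]
},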
{
"cell_type": "markdown",
"id": "f7899b92",
"metadata": {},
"source": [
"### Naloga\n",
"\n",
"Izračunajte število filmov posamezne dolžine, zaokrožene na 5 minut.\n",
"\n",
"## Risanje grafov\n",
"\n",
"Običajen graf dobimo z metodo `plot`. Uporabljamo ga, kadar želimo prikazati spreminjanje vrednosti v odvisnosti od zvezne spremenljivke. Naša hipoteza je, da so zlata leta filma mimo. Graf to zanika."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "7668cadd",
"metadata": {},
"outputs": [],
"source": [
"filmi[filmi.ocena > 9].groupby('desetletje').size().plot()"
]
},
{
"cell_type": "markdown",
"id": "e01b2d35",
"metadata": {},
"source": [
"Razsevni diagram dobimo z metodo `plot.scatter`. Uporabljamo ga, če želimo ugotoviti povezavo med dvema spremenljivkama."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "1a043155",
"metadata": {},
"outputs": [],
"source": [
"filmi.plot.scatter('ocena', 'metascore')"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "00652c3f",
"metadata": {},
"outputs": [],
"source": [
"filmi[filmi.dolzina < 250].plot.scatter('dolzina', 'ocena')"
]
},
{
"cell_type": "markdown",
"id": "1c1e940e",
"metadata": {},
"source": [
"Stolpčni diagram dobimo z metodo `plot.bar`. Uporabljamo ga, če želimo primerjati vrednosti pri diskretnih (običajno kategoričnih) spremenljivkah. Pogosto je koristno, da graf uredimo po vrednostih."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "3a015707",
"metadata": {},
"outputs": [],
"source": [
"filmi.sort_values('zasluzek', ascending=False).head(20).plot.bar(x='naslov', y='zasluzek')"
]
},
{
"cell_type": "markdown",
"id": "2085ae10",
"metadata": {},
"source": [
"### Naloga\n",
"\n",
"Narišite grafe, ki ustrezno kažejo:\n",
"\n",
"- Povezavo med IMDB in metascore oceno\n",
"- Spreminjanje povprečne dolžine filmov skozi leta\n",
"\n",
"## Stikanje"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "1e8bbfc2",
"metadata": {},
"outputs": [],
"source": [
"osebe = pd.read_csv('podatki/osebe.csv', index_col='id')\n",
"vloge = pd.read_csv('podatki/vloge.csv')\n",
"zanri = pd.read_csv('podatki/zanri.csv')"
]
},
{
"cell_type": "markdown",
"id": "b20ee006",
"metadata": {},
"source": [
"Razpredelnice stikamo s funkcijo `merge`, ki vrne razpredelnico vnosov iz obeh tabel, pri katerih se vsi istoimenski podatki ujemajo."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "cac8d36d",
"metadata": {},
"outputs": [],
"source": [
"vloge[vloge.film == 12349]"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "007dbbd3",
"metadata": {},
"outputs": [],
"source": [
"zanri[zanri.film == 12349]"
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "f8f9c242",
"metadata": {},
"outputs": [],
"source": [
"pd.merge(vloge, zanri).head(20)"
]
},
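{
"cell_type": "markdown",
"id": "5b7e2c03",
"metadata": {},
"source": [
"To see how `merge` pairs rows on identically named columns, here is a small sketch with made-up tables (`vloge_primer` and `zanri_primer` are hypothetical stand-ins, not the contents of the CSV files). The only shared column is `film`, so every role of a film is combined with every genre of the same film."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "5b7e2c04",
"metadata": {},
"outputs": [],
"source": [
"import pandas as pd\n",
"\n",
"# Made-up rows: film 1 has two roles and two genres, film 2 has one of each\n",
"vloge_primer = pd.DataFrame({\n",
"    'film': [1, 1, 2],\n",
"    'oseba': [10, 11, 12],\n",
"    'vloga': ['I', 'R', 'I'],\n",
"})\n",
"zanri_primer = pd.DataFrame({\n",
"    'film': [1, 1, 2],\n",
"    'zanr': ['Comedy', 'Drama', 'Drama'],\n",
"})\n",
"\n",
"# The only shared column name is 'film', so merge matches rows on it;\n",
"# film 1 yields 2 roles x 2 genres = 4 rows, film 2 yields 1 row\n",
"staknjeno = pd.merge(vloge_primer, zanri_primer)\n",
"staknjeno"
]
},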
{
"cell_type": "markdown",
"id": "f0f6ec46",
"metadata": {},
"source": [
"V osnovi vsebuje staknjena razpredelnica le tiste vnose, ki se pojavijo v obeh tabelah. Temu principu pravimo notranji stik (_inner join_). Lahko pa se odločimo, da izberemo tudi tiste vnose, ki imajo podatke le v levi tabeli (_left join_), le v desni tabeli (_right join_) ali v vsaj eni tabeli (_outer join_). Če v eni tabeli ni vnosov, bodo v staknjeni tabeli označene manjkajoče vrednosti. Ker smo v našem primeru podatke jemali iz IMDBja, kjer so za vsak film določeni tako žanri kot vloge, do razlik ne pride.\n",
"\n",
"Včasih želimo stikati tudi po stolpcih z različnimi imeni. V tem primeru funkciji `merge` podamo argumenta `left_on` in `right_on`."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "066dac1f",
"metadata": {},
"outputs": [],
"source": [
"pd.merge(pd.merge(vloge, zanri), osebe, left_on='oseba', right_on='id')"
]
},
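{
"cell_type": "markdown",
"id": "5b7e2c05",
"metadata": {},
"source": [
"Because the IMDb data is complete, the join types described earlier cannot be told apart on it. A minimal sketch with made-up, deliberately incomplete tables (the names `vloge_nepopolne` and `zanri_nepopolni` are assumptions for illustration) shows the difference; the join type is chosen with the `how` argument of `merge`."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "5b7e2c06",
"metadata": {},
"outputs": [],
"source": [
"import pandas as pd\n",
"\n",
"# Made-up tables: film 3 has no genre, film 4 has no role\n",
"vloge_nepopolne = pd.DataFrame({'film': [1, 3], 'oseba': [10, 11]})\n",
"zanri_nepopolni = pd.DataFrame({'film': [1, 4], 'zanr': ['Comedy', 'Drama']})\n",
"\n",
"# how= selects the join type; the default is 'inner'\n",
"notranji = pd.merge(vloge_nepopolne, zanri_nepopolni, how='inner')\n",
"levi = pd.merge(vloge_nepopolne, zanri_nepopolni, how='left')\n",
"zunanji = pd.merge(vloge_nepopolne, zanri_nepopolni, how='outer')\n",
"\n",
"# Rows without a partner get NaN in the columns of the other table\n",
"zunanji"
]
},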
{
"cell_type": "markdown",
"id": "31240600",
"metadata": {},
"source": [
"Poglejmo, katera osebe so nastopale v največ komedijah."
]
},
{
"cell_type": "code",
"execution_count": null,
"id": "7cb00e5a",
"metadata": {},
"outputs": [],
"source": [
"zanri_oseb = pd.merge(pd.merge(vloge, zanri), osebe, left_on='oseba', right_on='id')\n",
"zanri_oseb[\n",
" (zanri_oseb.zanr == 'Comedy') &\n",
" (zanri_oseb.vloga == 'I')\n",
"].groupby(\n",
" 'ime'\n",
").size(\n",
").sort_values(\n",
" ascending=False\n",
").head(20)"
]
},
{
"cell_type": "markdown",
"id": "ace0fad1",
"metadata": {},
"source": [
"### Naloga\n",
"\n",
"- Izračunajte povprečno oceno vsakega žanra.\n",
"- Kateri režiserji snemajo najdonosnejše filme?"
]
}
],
"metadata": {
"jupytext": {
"cell_metadata_filter": "-all",
"formats": "md:myst",
"text_representation": {
"extension": ".md",
"format_name": "myst",
"format_version": "0.8",
"jupytext_version": "1.5.0"
}
},
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.10.13"
},
"source_map": [
14,
22,
31,
37,
41,
43,
47,
49,
53,
55,
59,
61,
65,
67,
71,
75,
77,
83,
87,
91,
93,
99,
101,
107,
111,
114,
120,
124,
128,
131,
135,
139,
143,
145,
149,
151,
155,
157,
161,
163,
173,
175,
179,
183,
185,
189,
191,
202,
206,
210,
214,
218,
220,
226,
228,
232,
243
]
},
"nbformat": 4,
"nbformat_minor": 5
}