{ "cells": [ { "cell_type": "markdown", "id": "40af5846", "metadata": {}, "source": [ "# The Pandas library\n", "\n", "Below is an overview of the most basic methods offered by the Pandas library. Each of them has many additional options, which are described in detail in the [official documentation](http://pandas.pydata.org/pandas-docs/stable/). The best place to start reading the documentation is the [tutorials](http://pandas.pydata.org/pandas-docs/stable/tutorials.html).\n", "\n", "## Setup" ] }, { "cell_type": "code", "execution_count": 1, "id": "f5d8a416", "metadata": {}, "outputs": [], "source": [ "# load the package\n", "import pandas as pd\n", "\n", "# load the table we will be working with\n", "filmi = pd.read_csv('podatki/filmi.csv', index_col='id')\n", "\n", "# because we will be working with large tables, display at most 20 rows\n", "pd.options.display.max_rows = 20" ] }, { "cell_type": "markdown", "id": "b1719851", "metadata": {}, "source": [ "## Basic selection of table elements\n", "\n", "The method `.head(n=5)` shows the first `n` rows of a table, and `.tail(n=5)` shows the last `n` rows." ] }, { "cell_type": "code", "execution_count": 2, "id": "dbc7384e", "metadata": {}, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ " naslov dolzina leto ocena metascore glasovi \\\n", "id \n", "4972 The Birth of a Nation 195 1915 6.2 NaN 24890 \n", "6864 Intolerance 197 1916 7.7 99.0 15670 \n", "9968 Broken Blossoms 90 1919 7.3 NaN 10423 \n", "10323 The Cabinet of Dr. Caligari 67 1920 8.0 NaN 64133 \n", "12349 The Kid 68 1921 8.3 NaN 126513 \n", "12364 The Phantom Carriage 107 1921 8.0 NaN 12624 \n", "13442 Nosferatu 94 1922 7.9 NaN 97589 \n", "14341 Our Hospitality 65 1923 7.8 NaN 11428 \n", "14429 Safety Last! 74 1923 8.1 NaN 20887 \n", "15064 The Last Laugh 90 1924 8.0 NaN 14150 \n", "\n", " zasluzek oznaka \\\n", "id \n", "4972 10000000.0 TV-PG \n", "6864 2180000.0 Passed \n", "9968 NaN Not Rated \n", "10323 NaN Not Rated \n", "12349 5450000.0 Passed \n", "12364 NaN Not Rated \n", "13442 NaN Not Rated \n", "14341 1172499.0 Passed \n", "14429 1359903.0 Not Rated \n", "15064 94812.0 Not Rated \n", "\n", " opis \n", "id \n", "4972 The Stoneman family finds its friendship with ... \n", "6864 The story of a poor young woman separated by p... \n", "9968 A frail waif, abused by her brutal boxer fathe... \n", "10323 Hypnotist Dr. Caligari uses a somnambulist, Ce... \n", "12349 The Tramp cares for an abandoned child, but ev... \n", "12364 On New Year's Eve, the driver of a ghostly car... \n", "13442 Vampire Count Orlok expresses interest in a ne... \n", "14341 A man returns to his Appalachian homestead. On... \n", "14429 A boy leaves his small country town and heads ... \n", "15064 An aging doorman is forced to face the scorn o... " ] }, "execution_count": 2, "metadata": {}, "output_type": "execute_result" } ], "source": [ "filmi.head(10)" ] }, { "cell_type": "code", "execution_count": 3, "id": "48aea215", "metadata": {}, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ " naslov dolzina leto \\\n", "id \n", "18568902 Kaun Pravin Tambe? 134 2022 \n", "18689424 Batman v Superman: Dawn of Justice - Ultimate ... 182 2016 \n", "18968540 Incantation 110 2022 \n", "20850406 Sita Ramam 163 2022 \n", "21279138 Maid in Malacañang 114 2022 \n", "\n", " ocena metascore glasovi zasluzek oznaka \\\n", "id \n", "18568902 8.4 NaN 10163 NaN NaN \n", "18689424 7.1 NaN 57662 NaN R \n", "18968540 6.2 NaN 12366 NaN TV-MA \n", "20850406 8.5 NaN 38490 NaN NaN \n", "21279138 3.9 NaN 15273 NaN NaN \n", "\n", " opis \n", "id \n", "18568902 An indian cricketer who shows persistence and ... \n", "18689424 Batman is manipulated by Lex Luthor to fear Su... \n", "18968540 Six years ago, Li Ronan was cursed after break... \n", "20850406 An orphan soldier, Lieutenant Ram's life chang... \n", "21279138 The Last Days of Ferdinand and Imelda Marcos t... " ] }, "execution_count": 3, "metadata": {}, "output_type": "execute_result" } ], "source": [ "filmi.tail()" ] }, { "cell_type": "markdown", "id": "f3993aa4", "metadata": {}, "source": [ "Slices select rows by position: `filmi[3:10:2]` takes every second row from position 3 up to, but not including, position 10." ] }, { "cell_type": "code", "execution_count": 4, "id": "3f8575e8", "metadata": {}, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ " naslov dolzina leto ocena metascore glasovi \\\n", "id \n", "10323 The Cabinet of Dr. Caligari 67 1920 8.0 NaN 64133 \n", "12364 The Phantom Carriage 107 1921 8.0 NaN 12624 \n", "14341 Our Hospitality 65 1923 7.8 NaN 11428 \n", "15064 The Last Laugh 90 1924 8.0 NaN 14150 \n", "\n", " zasluzek oznaka opis \n", "id \n", "10323 NaN Not Rated Hypnotist Dr. Caligari uses a somnambulist, Ce... \n", "12364 NaN Not Rated On New Year's Eve, the driver of a ghostly car... \n", "14341 1172499.0 Passed A man returns to his Appalachian homestead. On... \n", "15064 94812.0 Not Rated An aging doorman is forced to face the scorn o... " ] }, "execution_count": 4, "metadata": {}, "output_type": "execute_result" } ], "source": [ "filmi[3:10:2]" ] }, { "cell_type": "markdown", "id": "671eb18a", "metadata": {}, "source": [ "Indexing the table with a column name gives access to an individual column." ] }, { "cell_type": "code", "execution_count": 5, "id": "24345646", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "id\n", "4972 6.2\n", "6864 7.7\n", "9968 7.3\n", "10323 8.0\n", "12349 8.3\n", " ... \n", "18568902 8.4\n", "18689424 7.1\n", "18968540 6.2\n", "20850406 8.5\n", "21279138 3.9\n", "Name: ocena, Length: 9999, dtype: float64" ] }, "execution_count": 5, "metadata": {}, "output_type": "execute_result" } ], "source": [ "filmi['ocena']" ] }, { "cell_type": "markdown", "id": "1848e350", "metadata": {}, "source": [ "Because columns are accessed so often, a shorter attribute notation is also available. It works only when the column name is a valid Python identifier and does not clash with an existing attribute or method of the table." ] }, { "cell_type": "code", "execution_count": 6, "id": "981f233b", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "id\n", "4972 6.2\n", "6864 7.7\n", "9968 7.3\n", "10323 8.0\n", "12349 8.3\n", " ... \n", "18568902 8.4\n", "18689424 7.1\n", "18968540 6.2\n", "20850406 8.5\n", "21279138 3.9\n", "Name: ocena, Length: 9999, dtype: float64" ] }, "execution_count": 6, "metadata": {}, "output_type": "execute_result" } ], "source": [ "filmi.ocena" ] }, { "cell_type": "markdown", "id": "e82b589a", "metadata": {}, "source": [ "To select several columns, pass a list of their labels as the index." ] }, { "cell_type": "code", "execution_count": 7, "id": "bdd04c83", "metadata": {}, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ " naslov ocena\n", "id \n", "4972 The Birth of a Nation 6.2\n", "6864 Intolerance 7.7\n", "9968 Broken Blossoms 7.3\n", "10323 The Cabinet of Dr. Caligari 8.0\n", "12349 The Kid 8.3\n", "... ... ...\n", "18568902 Kaun Pravin Tambe? 8.4\n", "18689424 Batman v Superman: Dawn of Justice - Ultimate ... 7.1\n", "18968540 Incantation 6.2\n", "20850406 Sita Ramam 8.5\n", "21279138 Maid in Malacañang 3.9\n", "\n", "[9999 rows x 2 columns]" ] }, "execution_count": 7, "metadata": {}, "output_type": "execute_result" } ], "source": [ "filmi[['naslov', 'ocena']]" ] }, { "cell_type": "markdown", "id": "5c53107f", "metadata": {}, "source": [ "The row at integer position `i` is accessed with `.iloc[i]`, and the row with index label `k` with `.loc[k]`." ] }, { "cell_type": "code", "execution_count": 8, "id": "32d7daeb", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "naslov Rebecca\n", "dolzina 130\n", "leto 1940\n", "ocena 8.1\n", "metascore 86.0\n", "glasovi 137358\n", "zasluzek 4360000.0\n", "oznaka Approved\n", "opis A self-conscious woman juggles adjusting to he...\n", "Name: 32976, dtype: object" ] }, "execution_count": 8, "metadata": {}, "output_type": "execute_result" } ], "source": [ "filmi.iloc[120]" ] }, { "cell_type": "code", "execution_count": 9, "id": "9efdc106", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "naslov Indiana Jones and the Last Crusade\n", "dolzina 127\n", "leto 1989\n", "ocena 8.2\n", "metascore 65.0\n", "glasovi 750654\n", "zasluzek 197171806.0\n", "oznaka PG-13\n", "opis In 1938, after his father Professor Henry Jone...\n", "Name: 97576, dtype: object" ] }, "execution_count": 9, "metadata": {}, "output_type": "execute_result" } ], "source": [ "filmi.loc[97576]" ] }, { "cell_type": "markdown", "id": "482938d9", "metadata": {}, "source": [ "## Filtering\n", "\n", "To select particular rows of a table, pass a column of Boolean values as the index; such a column is obtained with the usual comparison operations. The returned table keeps the rows for which the column holds `True`." ] }, { "cell_type": "code", "execution_count": 10, "id": "4edb5c50", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "id\n", "4972 False\n", "6864 False\n", "9968 False\n", "10323 True\n", "12349 True\n", " ... \n", "18568902 True\n", "18689424 False\n", "18968540 False\n", "20850406 True\n", "21279138 False\n", "Name: ocena, Length: 9999, dtype: bool" ] }, "execution_count": 10, "metadata": {}, "output_type": "execute_result" } ], "source": [ "filmi.ocena >= 8" ] }, { "cell_type": "code", "execution_count": 11, "id": "87e652a1", "metadata": {}, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ " naslov dolzina leto ocena metascore glasovi \\\n", "id \n", "111161 The Shawshank Redemption 142 1994 9.3 81.0 2651625 \n", "15327088 Kantara 148 2022 9.4 NaN 33294 \n", "\n", " zasluzek oznaka opis \n", "id \n", "111161 28341469.0 R Two imprisoned men bond over a number of years... \n", "15327088 NaN NaN It involves culture of Kambla and Bhootha Kola... " ] }, "execution_count": 11, "metadata": {}, "output_type": "execute_result" } ], "source": [ "filmi[filmi.ocena >= 9.3]" ] }, { "cell_type": "code", "execution_count": 12, "id": "26aac5a2", "metadata": {}, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ " naslov dolzina leto \\\n", "id \n", "52077 Plan 9 from Outer Space 79 1957 \n", "54673 The Beast of Yucca Flats 54 1961 \n", "58548 Santa Claus Conquers the Martians 81 1964 \n", "59464 Monster a Go-Go 68 1965 \n", "60666 Manos: The Hands of Fate 70 1966 \n", "... ... ... ... \n", "15654262 Chup 135 2022 \n", "16492678 Demon Slayer: Kimetsu no Yaiba - Tsuzumi Mansi... 87 2021 \n", "18568902 Kaun Pravin Tambe? 134 2022 \n", "20850406 Sita Ramam 163 2022 \n", "21279138 Maid in Malacañang 114 2022 \n", "\n", " ocena metascore glasovi zasluzek oznaka \\\n", "id \n", "52077 3.9 56.0 38744 NaN Not Rated \n", "54673 1.8 NaN 11242 NaN Unrated \n", "58548 2.6 NaN 11838 NaN Not Rated \n", "59464 1.7 NaN 11138 NaN TV-PG \n", "60666 1.6 NaN 36445 NaN Not Rated \n", "... ... ... ... ... ... \n", "15654262 8.4 NaN 13098 NaN NaN \n", "16492678 9.0 NaN 12634 NaN NaN \n", "18568902 8.4 NaN 10163 NaN NaN \n", "20850406 8.5 NaN 38490 NaN NaN \n", "21279138 3.9 NaN 15273 NaN NaN \n", "\n", " opis \n", "id \n", "52077 Evil aliens attack Earth and set their terribl... \n", "54673 A defecting Soviet scientist is hit by a nucle... \n", "58548 The Martians kidnap Santa Claus because there ... \n", "59464 A space capsule crash-lands on Earth, and the ... \n", "60666 A family gets lost on the road and stumbles up... \n", "... ... \n", "15654262 A psychopath killer, targeting film critics. T... \n", "16492678 Tanjiro ventures to the south-southeast where ... \n", "18568902 An indian cricketer who shows persistence and ... \n", "20850406 An orphan soldier, Lieutenant Ram's life chang... \n", "21279138 The Last Days of Ferdinand and Imelda Marcos t... 
\n", "\n", "[695 rows x 9 columns]" ] }, "execution_count": 12, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# & binds more tightly than |, so this reads (A & B) | C\n", "filmi[(filmi.leto > 2010) & (filmi.ocena > 8) | (filmi.ocena < 5)]" ] }, { "cell_type": "markdown", "id": "c9acb1ca", "metadata": {}, "source": [ "### Exercise\n", "\n", "Find the films we want to avoid at all costs, that is, those longer than two hours with a rating below 4." ] }, { "cell_type": "code", "execution_count": 13, "id": "581eda77", "metadata": {}, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ " naslov dolzina leto ocena metascore glasovi \\\n", "id \n", "118688 Batman & Robin 125 1997 3.7 28.0 253972 \n", "120179 Speed 2: Cruise Control 121 1997 3.9 23.0 81714 \n", "2574698 Gunday 152 2014 2.6 NaN 59270 \n", "7886848 Sadak 2 133 2020 1.1 NaN 95865 \n", "10350922 Laxmii 141 2020 2.6 NaN 57411 \n", "10888594 Radhe 135 2021 1.9 NaN 177814 \n", "\n", " zasluzek oznaka \\\n", "id \n", "118688 107325195.0 PG-13 \n", "120179 48608066.0 PG-13 \n", "2574698 NaN Not Rated \n", "7886848 NaN TV-MA \n", "10350922 NaN TV-MA \n", "10888594 NaN TV-MA \n", "\n", " opis \n", "id \n", "118688 Batman and Robin try to keep their relationshi... \n", "120179 A computer hacker breaks into the computer sys... \n", "2574698 The lives of Calcutta's most powerful Gunday, ... \n", "7886848 The film picks up where Sadak left off, revolv... \n", "10350922 Aasif visits his wife's parents' house and hap... \n", "10888594 After taking the dreaded gangster Gani Bhai, A... " ] }, "execution_count": 13, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# the vote threshold goes beyond the task statement and keeps only widely rated films\n", "filmi[(filmi.dolzina > 120) & (filmi.ocena < 4) & (filmi.glasovi > 50000)]" ] }, { "cell_type": "markdown", "id": "c7bf000b", "metadata": {}, "source": [ "## Sorting\n", "\n", "A table is sorted with the `.sort_values` method, which takes the name, or a list of names, of the columns to sort by. Optionally, the `ascending` argument says which columns are sorted in ascending and which in descending order." ] }, { "cell_type": "code", "execution_count": 14, "id": "0bab9d68", "metadata": {}, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ " naslov dolzina leto ocena metascore \\\n", "id \n", "2061702 To the Forest of Firefly Lights 45 2011 7.8 NaN \n", "15324 Sherlock Jr. 45 1924 8.2 NaN \n", "2591814 The Garden of Words 46 2013 7.4 NaN \n", "275230 Blood: The Last Vampire 48 2000 6.6 44.0 \n", "142236 Dragon Ball Z: Revival Fusion 51 1995 7.6 NaN \n", "... ... ... ... ... ... \n", "107007 Gettysburg 271 1993 7.6 NaN \n", "74084 1900 317 1976 7.7 70.0 \n", "1954470 Gangs of Wasseypur 321 2012 8.2 89.0 \n", "346336 The Best of Youth 366 2003 8.5 89.0 \n", "111341 Satantango 439 1994 8.3 NaN \n", "\n", " glasovi zasluzek oznaka \\\n", "id \n", "2061702 18535 NaN NaN \n", "15324 50180 977375.0 Passed \n", "2591814 44624 NaN TV-14 \n", "275230 12761 NaN Not Rated \n", "142236 11050 NaN PG \n", "... ... ... ... \n", "107007 29479 10769960.0 PG \n", "74084 25679 NaN Unrated \n", "1954470 96141 NaN Not Rated \n", "346336 22119 274024.0 R \n", "111341 11214 NaN Not Rated \n", "\n", " opis \n", "id \n", "2061702 Hotaru is rescued by a spirit when she gets lo... \n", "15324 A film projectionist longs to be a detective, ... \n", "2591814 A 15-year-old boy and 27-year-old woman find a... \n", "275230 Saya is a Japanese vampire slayer whose next m... \n", "142236 The universe is thrown into dimensional chaos ... \n", "... ... \n", "107007 In 1863, the Northern and Southern forces figh... \n", "74084 The epic tale of a class struggle in twentieth... \n", "1954470 A clash between Sultan and Shahid Khan leads t... \n", "346336 An Italian epic that follows the lives of two ... \n", "111341 On the eve of a large payment, residents of a ... \n", "\n", "[9999 rows x 9 columns]" ] }, "execution_count": 14, "metadata": {}, "output_type": "execute_result" } ], "source": [ "filmi.sort_values('dolzina')" ] }, { "cell_type": "code", "execution_count": 15, "id": "573492fc", "metadata": {}, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ " naslov dolzina leto ocena metascore glasovi \\\n", "id \n", "15327088 Kantara 148 2022 9.4 NaN 33294 \n", "111161 The Shawshank Redemption 142 1994 9.3 81.0 2651625 \n", "68646 The Godfather 175 1972 9.2 100.0 1838099 \n", "252487 The Chaos Class 87 1975 9.2 NaN 40747 \n", "50083 12 Angry Men 96 1957 9.0 96.0 782923 \n", "... ... ... ... ... ... ... \n", "421051 Daniel the Wizard 81 2004 1.2 NaN 14413 \n", "6038600 Smolensk 120 2016 1.2 NaN 39704 \n", "7886848 Sadak 2 133 2020 1.1 NaN 95865 \n", "5988370 Reis 108 2017 1.0 NaN 73382 \n", "7221896 Cumali Ceber 100 2017 1.0 NaN 38958 \n", "\n", " zasluzek oznaka \\\n", "id \n", "15327088 NaN NaN \n", "111161 28341469.0 R \n", "68646 134966411.0 R \n", "252487 NaN NaN \n", "50083 4360000.0 Approved \n", "... ... ... \n", "421051 NaN Not Rated \n", "6038600 NaN NaN \n", "7886848 NaN TV-MA \n", "5988370 NaN NaN \n", "7221896 NaN NaN \n", "\n", " opis \n", "id \n", "15327088 It involves culture of Kambla and Bhootha Kola... \n", "111161 Two imprisoned men bond over a number of years... \n", "68646 The aging patriarch of an organized crime dyna... \n", "252487 Lazy, uneducated students share a very close b... \n", "50083 The jury in a New York City murder trial is fr... \n", "... ... \n", "421051 Evil assassins want to kill Daniel Kublbock, t... \n", "6038600 An inspired story of people affected by the tr... \n", "7886848 The film picks up where Sadak left off, revolv... \n", "5988370 A drama about the early life of Recep Tayyip E... \n", "7221896 Cumali Ceber goes to a vacation with his child... 
\n", "\n", "[9999 rows x 9 columns]" ] }, "execution_count": 15, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# sort by rating in descending order first, then by year in ascending order within each rating\n", "filmi.sort_values(['ocena', 'leto'], ascending=[False, True])" ] }, { "cell_type": "markdown", "id": "ee831609", "metadata": {}, "source": [ "## Grouping\n", "\n", "The `.groupby` method creates a special grouped object in which rows are grouped by a shared property." ] }, { "cell_type": "code", "execution_count": 16, "id": "9f1340d9", "metadata": {}, "outputs": [], "source": [ "filmi_po_letih = filmi.groupby('leto')" ] }, { "cell_type": "code", "execution_count": 17, "id": "b3dfc27a", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "" ] }, "execution_count": 17, "metadata": {}, "output_type": "execute_result" } ], "source": [ "filmi_po_letih" ] }, { "cell_type": "code", "execution_count": 18, "id": "96848f15", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "leto\n", "1915 6.200000\n", "1916 7.700000\n", "1919 7.300000\n", "1920 8.000000\n", "1921 8.150000\n", " ... \n", "2018 6.430748\n", "2019 6.493051\n", "2020 6.144304\n", "2021 6.369742\n", "2022 6.361628\n", "Name: ocena, Length: 106, dtype: float64" ] }, "execution_count": 18, "metadata": {}, "output_type": "execute_result" } ], "source": [ "# average rating for each year\n", "filmi_po_letih.ocena.mean()" ] }, { "cell_type": "markdown", "id": "980188d2", "metadata": {}, "source": [ "If we wish, we can also group by computed properties. Let us compute a column for the decade and store it in the table." ] }, { "cell_type": "code", "execution_count": 19, "id": "3db1909f", "metadata": {}, "outputs": [], "source": [ "filmi['desetletje'] = 10 * (filmi.leto // 10)" ] }, { "cell_type": "code", "execution_count": 20, "id": "1d1d4ed8", "metadata": {}, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ " naslov dolzina leto \\\n", "id \n", "4972 The Birth of a Nation 195 1915 \n", "6864 Intolerance 197 1916 \n", "9968 Broken Blossoms 90 1919 \n", "10323 The Cabinet of Dr. Caligari 67 1920 \n", "12349 The Kid 68 1921 \n", "... ... ... ... \n", "18568902 Kaun Pravin Tambe? 134 2022 \n", "18689424 Batman v Superman: Dawn of Justice - Ultimate ... 182 2016 \n", "18968540 Incantation 110 2022 \n", "20850406 Sita Ramam 163 2022 \n", "21279138 Maid in Malacañang 114 2022 \n", "\n", " ocena metascore glasovi zasluzek oznaka \\\n", "id \n", "4972 6.2 NaN 24890 10000000.0 TV-PG \n", "6864 7.7 99.0 15670 2180000.0 Passed \n", "9968 7.3 NaN 10423 NaN Not Rated \n", "10323 8.0 NaN 64133 NaN Not Rated \n", "12349 8.3 NaN 126513 5450000.0 Passed \n", "... ... ... ... ... ... \n", "18568902 8.4 NaN 10163 NaN NaN \n", "18689424 7.1 NaN 57662 NaN R \n", "18968540 6.2 NaN 12366 NaN TV-MA \n", "20850406 8.5 NaN 38490 NaN NaN \n", "21279138 3.9 NaN 15273 NaN NaN \n", "\n", " opis desetletje \n", "id \n", "4972 The Stoneman family finds its friendship with ... 1910 \n", "6864 The story of a poor young woman separated by p... 1910 \n", "9968 A frail waif, abused by her brutal boxer fathe... 1910 \n", "10323 Hypnotist Dr. Caligari uses a somnambulist, Ce... 1920 \n", "12349 The Tramp cares for an abandoned child, but ev... 1920 \n", "... ... ... \n", "18568902 An indian cricketer who shows persistence and ... 2020 \n", "18689424 Batman is manipulated by Lex Luthor to fear Su... 2010 \n", "18968540 Six years ago, Li Ronan was cursed after break... 2020 \n", "20850406 An orphan soldier, Lieutenant Ram's life chang... 2020 \n", "21279138 The Last Days of Ferdinand and Imelda Marcos t... 
2020 \n", "\n", "[9999 rows x 10 columns]" ] }, "execution_count": 20, "metadata": {}, "output_type": "execute_result" } ], "source": [ "filmi" ] }, { "cell_type": "code", "execution_count": 21, "id": "4f0fe759", "metadata": {}, "outputs": [], "source": [ "filmi_po_desetletjih = filmi.groupby('desetletje')" ] }, { "cell_type": "markdown", "id": "c0d39c61", "metadata": {}, "source": [ "Let's count how many films were made in each decade. Most columns give the same number, because each of them has the same number of entries. Where some values are missing, the number is smaller." ] }, { "cell_type": "code", "execution_count": 22, "id": "46d8cd3b", "metadata": {}, "outputs": [ { "data": { "text/html": [ "
" ], "text/plain": [ " naslov dolzina leto ocena metascore glasovi zasluzek \\\n", "desetletje \n", "1910 3 3 3 3 1 3 2 \n", "1920 27 27 27 27 4 27 18 \n", "1930 80 80 80 80 39 80 36 \n", "1940 134 134 134 134 63 134 46 \n", "1950 205 205 205 205 113 205 92 \n", "1960 284 284 284 284 172 284 150 \n", "1970 410 410 410 410 323 410 276 \n", "1980 823 823 823 823 721 823 711 \n", "1990 1420 1420 1420 1420 1128 1420 1324 \n", "2000 2575 2575 2575 2575 2183 2575 2203 \n", "2010 3358 3358 3358 3358 2728 3358 2353 \n", "2020 680 680 680 680 500 680 47 \n", "\n", " oznaka opis \n", "desetletje \n", "1910 3 3 \n", "1920 27 27 \n", "1930 80 80 \n", "1940 133 134 \n", "1950 205 205 \n", "1960 281 284 \n", "1970 394 410 \n", "1980 809 823 \n", "1990 1399 1420 \n", "2000 2507 2575 \n", "2010 3228 3358 \n", "2020 586 680 " ] }, "execution_count": 22, "metadata": {}, "output_type": "execute_result" } ], "source": [ "filmi_po_desetletjih.count()" ] }, { "cell_type": "markdown", "id": "217e0fe0", "metadata": {}, "source": [ "If we only want the number of members in each group, we use the `.size()` method. In this case we get a single column rather than a table." ] }, { "cell_type": "code", "execution_count": 23, "id": "74307cc2", "metadata": {}, "outputs": [ { "data": { "text/plain": [ "desetletje\n", "1910 3\n", "1920 27\n", "1930 80\n", "1940 134\n", "1950 205\n", "1960 284\n", "1970 410\n", "1980 823\n", "1990 1420\n", "2000 2575\n", "2010 3358\n", "2020 680\n", "dtype: int64" ] }, "execution_count": 23, "metadata": {}, "output_type": "execute_result" } ], "source": [ "filmi_po_desetletjih.size()" ] }, { "cell_type": "markdown", "id": "9ee12965", "metadata": {}, "source": [ "Let's look at the averages for each decade. We get the average year, length, rating and earnings. Text columns such as the title have no average, so they do not appear in the result." 
] }, { "cell_type": "code", "execution_count": null, "id": "b5115c01", "metadata": {}, "outputs": [], "source": [ "# numeric_only=True restricts the mean to numeric columns; since pandas 2.0 the default\n", "# is False, so a plain .mean() raises a TypeError on text columns such as naslov\n", "filmi_po_desetletjih.mean(numeric_only=True)" ] }, { "cell_type": "markdown", "id": "f7899b92", "metadata": {}, "source": [ "### Exercise\n", "\n", "Compute the number of films of each length, rounded to 5 minutes.\n", "\n", "## Plotting\n", "\n", "An ordinary line plot is produced with the `plot` method. We use it to show how a value changes with respect to a continuous variable. 
Our hypothesis is that the golden years of film are over. The plot refutes it." ] }, { "cell_type": "code", "execution_count": null, "id": "7668cadd", "metadata": {}, "outputs": [], "source": [ "filmi[filmi.ocena > 9].groupby('desetletje').size().plot()" ] }, { "cell_type": "markdown", "id": "e01b2d35", "metadata": {}, "source": [ "A scatter plot is produced with the `plot.scatter` method. We use it to look for a relationship between two variables." ] }, { "cell_type": "code", "execution_count": null, "id": "1a043155", "metadata": {}, "outputs": [], "source": [ "filmi.plot.scatter('ocena', 'metascore')" ] }, { "cell_type": "code", "execution_count": null, "id": "00652c3f", "metadata": {}, "outputs": [], "source": [ "filmi[filmi.dolzina < 250].plot.scatter('dolzina', 'ocena')" ] }, { "cell_type": "markdown", "id": "1c1e940e", "metadata": {}, "source": [ "A bar chart is produced with the `plot.bar` method. We use it to compare values of discrete (usually categorical) variables. It is often useful to sort the chart by value." 
] }, { "cell_type": "code", "execution_count": null, "id": "3a015707", "metadata": {}, "outputs": [], "source": [ "# the 20 films with the highest earnings (zasluzek), labelled by title (naslov)\n", "filmi.sort_values('zasluzek', ascending=False).head(20).plot.bar(x='naslov', y='zasluzek')" ] }, { "cell_type": "markdown", "id": "2085ae10", "metadata": {}, "source": [ "### Exercise\n", "\n", "Draw plots that suitably show:\n", "\n", "- the relationship between the IMDb rating and the metascore\n", "- how the average film length changes over the years\n", "\n", "## Joining" ] }, { "cell_type": "code", "execution_count": null, "id": "1e8bbfc2", "metadata": {}, "outputs": [], "source": [ "osebe = pd.read_csv('podatki/osebe.csv', index_col='id')\n", "vloge = pd.read_csv('podatki/vloge.csv')\n", "zanri = pd.read_csv('podatki/zanri.csv')" ] }, { "cell_type": "markdown", "id": "b20ee006", "metadata": {}, "source": [ "Tables are joined with the `merge` function, which returns a table of the entries from both tables for which all identically named columns match." ] }, { "cell_type": "code", "execution_count": null, "id": "cac8d36d", "metadata": {}, "outputs": [], "source": [ "vloge[vloge.film == 12349]" ] }, { "cell_type": "code", "execution_count": null, "id": "007dbbd3", "metadata": {}, "outputs": [], "source": [ "zanri[zanri.film == 12349]" ] }, { "cell_type": "code", "execution_count": null, "id": "f8f9c242", "metadata": {}, "outputs": [], "source": [ "pd.merge(vloge, zanri).head(20)" ] }, { "cell_type": "markdown", "id": "f0f6ec46", "metadata": {}, "source": [ "By default, the joined table contains only the entries that appear in both tables. This is called an inner join (_inner join_). We can also choose to keep the entries that have data only in the left table (_left join_), only in the right table (_right join_), or in at least one of the tables (_outer join_). If one table has no matching entry, the corresponding fields in the joined table are marked as missing values. 
Our data come from IMDb, where every film has both genres and roles listed, so the join types do not differ here.\n", "\n", "Sometimes we also want to join on columns with different names. In that case we pass the `left_on` and `right_on` arguments to `merge`." ] }, { "cell_type": "code", "execution_count": null, "id": "066dac1f", "metadata": {}, "outputs": [], "source": [ "# 'id' is the index of osebe; merge also accepts index level names in right_on\n", "pd.merge(pd.merge(vloge, zanri), osebe, left_on='oseba', right_on='id')" ] }, { "cell_type": "markdown", "id": "31240600", "metadata": {}, "source": [ "Let us see which people appeared in the most comedies." ] }, { "cell_type": "code", "execution_count": null, "id": "7cb00e5a", "metadata": {}, "outputs": [], "source": [ "# join roles with genres, then attach the person data\n", "zanri_oseb = pd.merge(pd.merge(vloge, zanri), osebe, left_on='oseba', right_on='id')\n", "# keep comedy rows with role (vloga) 'I', count the rows per name (ime), show the top 20\n", "zanri_oseb[\n", "    (zanri_oseb.zanr == 'Comedy') &\n", "    (zanri_oseb.vloga == 'I')\n", "].groupby(\n", "    'ime'\n", ").size(\n", ").sort_values(\n", "    ascending=False\n", ").head(20)" ] }, { "cell_type": "markdown", "id": "ace0fad1", "metadata": {}, "source": [ "### Exercise\n", "\n", "- Compute the average rating of each genre.\n", "- Which directors make the most profitable films?" 
] } ], "metadata": { "jupytext": { "cell_metadata_filter": "-all", "formats": "md:myst", "text_representation": { "extension": ".md", "format_name": "myst", "format_version": "0.8", "jupytext_version": "1.5.0" } }, "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.10.13" }, "source_map": [ 14, 22, 31, 37, 41, 43, 47, 49, 53, 55, 59, 61, 65, 67, 71, 75, 77, 83, 87, 91, 93, 99, 101, 107, 111, 114, 120, 124, 128, 131, 135, 139, 143, 145, 149, 151, 155, 157, 161, 163, 173, 175, 179, 183, 185, 189, 191, 202, 206, 210, 214, 218, 220, 226, 228, 232, 243 ] }, "nbformat": 4, "nbformat_minor": 5 }