{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# LOC Colors - Data Management" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "*Export data as minimal JSON files - only the essentials to create the swatches in the browser*" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "To use the created swatches on [https://labs.onb.ac.at/en/topic/akon-swatches/](https://labs.onb.ac.at/en/topic/akon-swatches/), I need a JSON file that looks like this:\n", "\n", "```json\n", "[[\"AK111_461\", [\"#f4e6cd\", \"#cac4b2\", \"#7e8077\", \"#3e4139\", \"#2f3431\", \"#000304\"], \"Nonza\", \"gelaufen 1903\"],\n", "[\"AK111_072\", [\"#e2d7c1\", \"#a19c8f\", \"#504e42\", \"#494a44\", \"#010500\", \"#393c39\"], \"Kirchberg am Walde\", \"gelaufen 1914\"],\n", "[\"AK111_077\", [\"#454234\", \"#3e3b1f\", \"#7f7e77\", \"#a9b8be\", \"#3b4347\", \"#425a6b\"], \"Kirchberg am Wechsel\", \"gelaufen 1913\"]]\n", "```" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The first part of each entry (the id and the colors) comes from the created swatches; the second part (a name and an approximate date) comes from the metadata, which can be downloaded here: [https://labs.onb.ac.at/gitlab/labs-team/raw-metadata/raw/master/akon_postcards_public_domain.csv.bz2?inline=false](https://labs.onb.ac.at/gitlab/labs-team/raw-metadata/raw/master/akon_postcards_public_domain.csv.bz2?inline=false)." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Read created swatches" ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [], "source": [ "import pandas as pd" ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [], "source": [ "df = pd.read_csv('historic_postcards_color_swatches.csv.bz2', compression='bz2')" ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
Unnamed: 0akon_idimage_linkhex_colorshtml
2191421914AK036_405https://iiif.onb.ac.at/images/AKON/AK036_405/4...['#f2e3c1', '#e6dec6', '#8e8a7a', '#7b7864', '...<a href=\"https://iiif.onb.ac.at/images/AKON/AK...
\n", "
" ], "text/plain": [ " Unnamed: 0 akon_id \\\n", "21914 21914 AK036_405 \n", "\n", " image_link \\\n", "21914 https://iiif.onb.ac.at/images/AKON/AK036_405/4... \n", "\n", " hex_colors \\\n", "21914 ['#f2e3c1', '#e6dec6', '#8e8a7a', '#7b7864', '... \n", "\n", " html \n", "21914 \n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
\n", "" ], "text/plain": [ " akon_id hex_colors\n", "14980 AK010_595 ['#cfbfa4', '#62583b', '#c8c7b6', '#4f5144', '...\n", "7474 AK084_243 ['#aca693', '#414028', '#444537', '#5c5f57', '...\n", "23730 AK043_578 ['#4e502c', '#444735', '#51554a', '#dae1d1', '...\n", "30352 AK085_096 ['#b5aa9d', '#f3e7d7', '#756d62', '#211a0f', '...\n", "22389 AK038_067 ['#f2e9cb', '#989384', '#545245', '#fcf7db', '..." ] }, "execution_count": 5, "metadata": {}, "output_type": "execute_result" } ], "source": [ "id_and_hex_colors.sample(5)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Parse Color Array" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "In order to properly export the color array as a JSON array later, convert the data representation slightly." ] }, { "cell_type": "code", "execution_count": 6, "metadata": {}, "outputs": [], "source": [ "import json" ] }, { "cell_type": "code", "execution_count": 7, "metadata": {}, "outputs": [], "source": [ "id_and_hex_colors['colors'] = id_and_hex_colors['hex_colors'].apply(\n", " lambda c: json.loads(c.replace(\"'\", '\"'))\n", ")" ] }, { "cell_type": "code", "execution_count": 8, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
akon_idhex_colorscolors
1291AK116_455['#f4ece0', '#d4d2ce', '#716d5c', '#65665e', '...[#f4ece0, #d4d2ce, #716d5c, #65665e, #b2b4bb, ...
\n", "
" ], "text/plain": [ " akon_id hex_colors \\\n", "1291 AK116_455 ['#f4ece0', '#d4d2ce', '#716d5c', '#65665e', '... \n", "\n", " colors \n", "1291 [#f4ece0, #d4d2ce, #716d5c, #65665e, #b2b4bb, ... " ] }, "execution_count": 8, "metadata": {}, "output_type": "execute_result" } ], "source": [ "id_and_hex_colors.sample()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Do you see the subtle difference? The entry in the colors column is now an array with strings _without_ the single quotes `'`." ] }, { "cell_type": "code", "execution_count": 9, "metadata": {}, "outputs": [], "source": [ "id_and_colors = id_and_hex_colors[['akon_id', 'colors']].copy()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Add Metadata From Original Records" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Next up: Combining the id and the color array with names and dates. Read the metadata dump directly from the link found on [https://labs.onb.ac.at/en/dataset/akon/](https://labs.onb.ac.at/en/dataset/akon/):" ] }, { "cell_type": "code", "execution_count": 10, "metadata": {}, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "/opt/conda/lib/python3.6/site-packages/IPython/core/interactiveshell.py:2785: DtypeWarning: Columns (13) have mixed types. Specify dtype option on import or set low_memory=False.\n", " interactivity=interactivity, compiler=compiler, result=result)\n" ] } ], "source": [ "original = pd.read_csv('https://labs.onb.ac.at/gitlab/labs-team/raw-metadata/raw/master/akon_postcards_public_domain.csv.bz2?inline=false', compression='bz2')" ] }, { "cell_type": "code", "execution_count": 11, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
akon_idnamedate
16251AK016_087Neulengbachvor 1907
23864AK044_401Gloriette1907
33292AK085_517Milanovor 1905
13875AK008_070Schloß Schönbrunn1906
12223AK004_040Pürgg1909
4129AK125_097Altausseegelaufen 1901
23756AK044_080Radstädter Tauern1907
26479AK069_067Attersee1906
14251AK109_329Josefsthalvor 1907
28518AK063_083Gaußig1908
\n", "
" ], "text/plain": [ " akon_id name date\n", "16251 AK016_087 Neulengbach vor 1907\n", "23864 AK044_401 Gloriette 1907\n", "33292 AK085_517 Milano vor 1905\n", "13875 AK008_070 Schloß Schönbrunn 1906\n", "12223 AK004_040 Pürgg 1909\n", "4129 AK125_097 Altaussee gelaufen 1901\n", "23756 AK044_080 Radstädter Tauern 1907\n", "26479 AK069_067 Attersee 1906\n", "14251 AK109_329 Josefsthal vor 1907\n", "28518 AK063_083 Gaußig 1908" ] }, "execution_count": 11, "metadata": {}, "output_type": "execute_result" } ], "source": [ "original[['akon_id', 'name', 'date']].sample(10)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "These are the columns needed." ] }, { "cell_type": "code", "execution_count": 12, "metadata": {}, "outputs": [], "source": [ "original_info = original[['akon_id', 'name', 'date']].copy()" ] }, { "cell_type": "code", "execution_count": 13, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
akon_idnamedate
24871AK048_377Reichenau an der Rax1925
4191AK054_543Aflenz Kurortvor 1905
\n", "
" ], "text/plain": [ " akon_id name date\n", "24871 AK048_377 Reichenau an der Rax 1925\n", "4191 AK054_543 Aflenz Kurort vor 1905" ] }, "execution_count": 13, "metadata": {}, "output_type": "execute_result" } ], "source": [ "original_info.sample(2)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Pandas offers a handy function for merging two dataframes _not on the index_, but on a shared column:" ] }, { "cell_type": "code", "execution_count": 15, "metadata": {}, "outputs": [], "source": [ "colors_and_info = pd.merge(id_and_colors, original_info, on='akon_id')" ] }, { "cell_type": "code", "execution_count": 16, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
akon_idcolorsnamedate
16356AK016_310[#5f5a57, #9c938c, #4d453f, #cabbab, #3c3b39, ...Kapellen1908
1117AK116_129[#dcc9ab, #cec0a7, #a8a290, #48473a, #6c6c62, ...Garstengelaufen 1902
\n", "
" ], "text/plain": [ " akon_id colors name \\\n", "16356 AK016_310 [#5f5a57, #9c938c, #4d453f, #cabbab, #3c3b39, ... Kapellen \n", "1117 AK116_129 [#dcc9ab, #cec0a7, #a8a290, #48473a, #6c6c62, ... Garsten \n", "\n", " date \n", "16356 1908 \n", "1117 gelaufen 1902 " ] }, "execution_count": 16, "metadata": {}, "output_type": "execute_result" } ], "source": [ "colors_and_info.sample(2)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "That's exactly what's needed." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Save to JSON-File" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Pandas can export the data exactly in the target format:" ] }, { "cell_type": "code", "execution_count": 17, "metadata": {}, "outputs": [], "source": [ "colors_and_info.to_json('historic_postcards__id_colors_name_date.json', orient='values')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Extract Subset" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "That's almost all. For the preview on [https://labs.onb.ac.at/en/dataset/akon/](https://labs.onb.ac.at/en/dataset/akon/) I need a subset of 100 swatches and save them:" ] }, { "cell_type": "code", "execution_count": 18, "metadata": {}, "outputs": [], "source": [ "id_and_colors_100 = colors_and_info.iloc[:100]" ] }, { "cell_type": "code", "execution_count": 19, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
akon_idcolorsnamedate
99AK111_207[#403e2c, #f9f8eb, #7e7f6b, #35362d, #32342f, ...Klosterneuburggelaufen 1908
34AK111_293[#e6d8bf, #c7c1b1, #4a4c3c, #aeafa5, #464740, ...Komotauvor 1905
86AK111_184[#f7edd6, #d1cbb8, #f6f0db, #2d2a1b, #9e9f93, ...Klausenburggelaufen 1904
3AK111_026[#e2cba6, #9e8e73, #574c39, #3c311a, #4c473b, ...Kierling1922
39AK111_072[#e5d9c2, #c0baac, #928e81, #4c493e, #484943, ...Kirchberg am Waldegelaufen 1914
\n", "
" ], "text/plain": [ " akon_id colors \\\n", "99 AK111_207 [#403e2c, #f9f8eb, #7e7f6b, #35362d, #32342f, ... \n", "34 AK111_293 [#e6d8bf, #c7c1b1, #4a4c3c, #aeafa5, #464740, ... \n", "86 AK111_184 [#f7edd6, #d1cbb8, #f6f0db, #2d2a1b, #9e9f93, ... \n", "3 AK111_026 [#e2cba6, #9e8e73, #574c39, #3c311a, #4c473b, ... \n", "39 AK111_072 [#e5d9c2, #c0baac, #928e81, #4c493e, #484943, ... \n", "\n", " name date \n", "99 Klosterneuburg gelaufen 1908 \n", "34 Komotau vor 1905 \n", "86 Klausenburg gelaufen 1904 \n", "3 Kierling 1922 \n", "39 Kirchberg am Walde gelaufen 1914 " ] }, "execution_count": 19, "metadata": {}, "output_type": "execute_result" } ], "source": [ "id_and_colors_100.sample(5)" ] }, { "cell_type": "code", "execution_count": 20, "metadata": {}, "outputs": [], "source": [ "id_and_colors_100.to_json('historic_postcards__id_colors_name_date__100.json', orient='values')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "And done!" ] }, { "cell_type": "markdown", "metadata": {}, "source": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Below there's a fast-forward, compact version of what's been done above. No need to do all this again." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "# Compact (with other data source)" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Load Data" ] }, { "cell_type": "code", "execution_count": 65, "metadata": {}, "outputs": [ { "name": "stderr", "output_type": "stream", "text": [ "/opt/conda/lib/python3.6/site-packages/IPython/core/interactiveshell.py:2785: DtypeWarning: Columns (13) have mixed types. 
Specify dtype option on import or set low_memory=False.\n", " interactivity=interactivity, compiler=compiler, result=result)\n" ] } ], "source": [ "colors_hsv_clip = pd.read_csv('akon_with_hsv_clip50_color_swatches.csv.bz2', compression='bz2')\n", "raw_data = pd.read_csv('akon_postcards_public_domain_1925.csv.bz2', compression='bz2')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## View Data Format" ] }, { "cell_type": "code", "execution_count": 66, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
Unnamed: 0akon_idimage_linkhex_colorshtml
1191411914AK003_285https://iiif.onb.ac.at/images/AKON/AK003_285/2...['#050300', '#eee2c9', '#b6af9e', '#fdf7da', '...<a href=\"https://iiif.onb.ac.at/images/AKON/AK...
\n", "
" ], "text/plain": [ " Unnamed: 0 akon_id \\\n", "11914 11914 AK003_285 \n", "\n", " image_link \\\n", "11914 https://iiif.onb.ac.at/images/AKON/AK003_285/2... \n", "\n", " hex_colors \\\n", "11914 ['#050300', '#eee2c9', '#b6af9e', '#fdf7da', '... \n", "\n", " html \n", "11914
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
Unnamed: 0akon_ididaltitudebuildingcitycolorcommentmountainother...feature_classfeature_codegeoname_idlatitudelongitudenamecountry_idadmin_name_1admin_code_1geo
2343523435AK042_53325265434.0NaNFrohnleitenFalseNaNNaNNaN...PPPLA32779202.047.2666715.31667FrohnleitenATNaNNaN47.26667, 15.31667
\n", "

1 rows × 30 columns

\n", "" ], "text/plain": [ " Unnamed: 0 akon_id id altitude building city color \\\n", "23435 23435 AK042_533 25265 434.0 NaN Frohnleiten False \n", "\n", " comment mountain other ... feature_class feature_code \\\n", "23435 NaN NaN NaN ... P PPLA3 \n", "\n", " geoname_id latitude longitude name country_id admin_name_1 \\\n", "23435 2779202.0 47.26667 15.31667 Frohnleiten AT NaN \n", "\n", " admin_code_1 geo \n", "23435 NaN 47.26667, 15.31667 \n", "\n", "[1 rows x 30 columns]" ] }, "execution_count": 67, "metadata": {}, "output_type": "execute_result" } ], "source": [ "raw_data.sample()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Combine Data" ] }, { "cell_type": "code", "execution_count": 68, "metadata": {}, "outputs": [], "source": [ "combined_data = pd.merge(colors_hsv_clip[['akon_id', 'hex_colors', 'image_link']],\n", " raw_data[['akon_id', 'name', 'date']],\n", " on='akon_id')" ] }, { "cell_type": "code", "execution_count": 69, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
akon_idhex_colorsimage_linknamedate
23217AK041_595['#ada896', '#fcf6d5', '#767467', '#484739', '...https://iiif.onb.ac.at/images/AKON/AK041_595/5...Ötscher1909
\n", "
" ], "text/plain": [ " akon_id hex_colors \\\n", "23217 AK041_595 ['#ada896', '#fcf6d5', '#767467', '#484739', '... \n", "\n", " image_link name date \n", "23217 https://iiif.onb.ac.at/images/AKON/AK041_595/5... Ötscher 1909 " ] }, "execution_count": 69, "metadata": {}, "output_type": "execute_result" } ], "source": [ "combined_data.sample()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Flatten hex_colors" ] }, { "cell_type": "code", "execution_count": 70, "metadata": {}, "outputs": [], "source": [ "combined_data['hex_colors_list'] = combined_data['hex_colors'].apply(lambda c: json.loads(c.replace(\"'\", '\"')))" ] }, { "cell_type": "code", "execution_count": 71, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
akon_idhex_colorsimage_linknamedatehex_colors_list
15996AK014_589['#020100', '#fbfae8', '#88887e', '#64645a', '...https://iiif.onb.ac.at/images/AKON/AK014_589/5...Maria Taferl1909[#020100, #fbfae8, #88887e, #64645a, #4d4f49, ...
\n", "
" ], "text/plain": [ " akon_id hex_colors \\\n", "15996 AK014_589 ['#020100', '#fbfae8', '#88887e', '#64645a', '... \n", "\n", " image_link name date \\\n", "15996 https://iiif.onb.ac.at/images/AKON/AK014_589/5... Maria Taferl 1909 \n", "\n", " hex_colors_list \n", "15996 [#020100, #fbfae8, #88887e, #64645a, #4d4f49, ... " ] }, "execution_count": 71, "metadata": {}, "output_type": "execute_result" } ], "source": [ "combined_data.sample()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Sanitize and Reorder" ] }, { "cell_type": "code", "execution_count": 73, "metadata": {}, "outputs": [], "source": [ "combined_data = combined_data.drop(columns=['hex_colors']).copy()" ] }, { "cell_type": "code", "execution_count": 74, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
akon_idimage_linknamedatehex_colors_list
19590AK028_177https://iiif.onb.ac.at/images/AKON/AK028_177/1...Frohnleiten1906[#020100, #a8a599, #7b7a6f, #fbf9e5, #4b4b40, ...
\n", "
" ], "text/plain": [ " akon_id image_link \\\n", "19590 AK028_177 https://iiif.onb.ac.at/images/AKON/AK028_177/1... \n", "\n", " name date hex_colors_list \n", "19590 Frohnleiten 1906 [#020100, #a8a599, #7b7a6f, #fbf9e5, #4b4b40, ... " ] }, "execution_count": 74, "metadata": {}, "output_type": "execute_result" } ], "source": [ "combined_data.sample()" ] }, { "cell_type": "code", "execution_count": 78, "metadata": {}, "outputs": [], "source": [ "combined_data = combined_data.rename(columns={'hex_colors_list': 'hex_colors'})" ] }, { "cell_type": "code", "execution_count": 79, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
akon_idimage_linknamedatehex_colors
33304AK087_042https://iiif.onb.ac.at/images/AKON/AK087_042/0...Abcoudevor 1905[#f8eacd, #aca391, #5b5747, #6e6a5e, #525148, ...
\n", "
" ], "text/plain": [ " akon_id image_link name \\\n", "33304 AK087_042 https://iiif.onb.ac.at/images/AKON/AK087_042/0... Abcoude \n", "\n", " date hex_colors \n", "33304 vor 1905 [#f8eacd, #aca391, #5b5747, #6e6a5e, #525148, ... " ] }, "execution_count": 79, "metadata": {}, "output_type": "execute_result" } ], "source": [ "combined_data.sample()" ] }, { "cell_type": "code", "execution_count": 81, "metadata": {}, "outputs": [ { "data": { "text/html": [ "
\n", "\n", "\n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", " \n", "
akon_idhex_colorsimage_linknamedate
25575AK031_287[#444626, #caccbc, #4a4d41, #48504f, #5b7073, ...https://iiif.onb.ac.at/images/AKON/AK031_287/2...Ebensee1907
\n", "
" ], "text/plain": [ " akon_id hex_colors \\\n", "25575 AK031_287 [#444626, #caccbc, #4a4d41, #48504f, #5b7073, ... \n", "\n", " image_link name date \n", "25575 https://iiif.onb.ac.at/images/AKON/AK031_287/2... Ebensee 1907 " ] }, "execution_count": 81, "metadata": {}, "output_type": "execute_result" } ], "source": [ "combined_data = combined_data[['akon_id', 'hex_colors', 'image_link', 'name', 'date']]\n", "combined_data.sample()" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Sample and Write" ] }, { "cell_type": "code", "execution_count": 82, "metadata": {}, "outputs": [], "source": [ "combined_data.iloc[:100].to_json('swatches_100.json', orient='values')" ] }, { "cell_type": "code", "execution_count": 83, "metadata": {}, "outputs": [], "source": [ "combined_data.to_json('swatches_all.json', orient='values')" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Alternate Data Format Without Link" ] }, { "cell_type": "code", "execution_count": 84, "metadata": {}, "outputs": [], "source": [ "sans_link = combined_data[['akon_id', 'hex_colors', 'name', 'date']]" ] }, { "cell_type": "code", "execution_count": 85, "metadata": {}, "outputs": [], "source": [ "sans_link.iloc[:100].to_json('swatches_100_nolink.json', orient='values')" ] }, { "cell_type": "code", "execution_count": 86, "metadata": {}, "outputs": [], "source": [ "sans_link.to_json('swatches_all_nolink.json', orient='values')" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [] } ], "metadata": { "kernelspec": { "display_name": "Python 3 User Default", "language": "python", "name": "python_3_user_default" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.6.5" } }, "nbformat": 4, "nbformat_minor": 2 }