This R package provides miscellaneous tools for Finnish open government data, complementing other rOpenGov packages that have a more specific scope. We also maintain a todo list of further data sources to be added. Contributions, bug reports, and other feedback are welcome! For further information, see the home page.
Installation (Asennus)
Finnish provinces (Maakuntatason informaatio)
Finnish municipalities (Kuntatason informaatio)
Finnish personal identification number (HETU) (Henkilotunnuksen kasittely)
Visualization tools (Visualisointirutiineja)
We assume you have installed R. If you use RStudio, change the default encoding to UTF-8. Linux users should also install cURL.
Install the stable release version in R:
install.packages("sorvi")
Test the installation by loading the library:
library(sorvi)
We also recommend setting the UTF-8 encoding:
Sys.setlocale(locale="UTF-8")
Brief examples of the package tools are provided below. Further examples are available in the Louhos blog and in our R Markdown blog.
Source: Wikipedia
tab <- get_province_info_wikipedia()
head(tab)
Finnish-English translations for province names (we have not been able to solve all encoding problems yet; suggestions very welcome!):
translations <- load_sorvi_data("translations")
head(translations)
Finnish municipality information is available through Statistics Finland (Tilastokeskus) and Land Survey Finland (Maanmittauslaitos). The row names for each data set are harmonized and can be used to match data sets from different sources, as different data sets may carry slightly different versions of certain municipality names.
Source: Maanmittauslaitos, MML.
municipality.info.mml <- get_municipality_info_mml()
municipality.info.mml[1:2,]
Source: Tilastokeskus
# Download Statfi municipality data
municipality.info.statfi <- get_municipality_info_statfi()
# List available information fields for municipalities
names(municipality.info.statfi)
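The row-name matching described above can be sketched as follows. This is a minimal illustration only: the two toy tables and their column names (`area_km2`, `population`) are hypothetical stand-ins with made-up values, not the actual fields returned by the package functions.

```r
# Toy tables standing in for two data sources.
# All values and column names are made up for illustration.
area <- data.frame(area_km2 = c(10, 20, 30),
                   row.names = c("Helsinki", "Tampere", "Turku"))
pop <- data.frame(population = c(300, 100),
                  row.names = c("Turku", "Helsinki"))

# Municipalities present in both tables
common <- intersect(rownames(area), rownames(pop))

# Index both tables by the shared row names so that the rows line up
combined <- cbind(area[common, , drop = FALSE],
                  pop[common, , drop = FALSE])
print(combined)
```

Indexing by row name rather than by position keeps the rows aligned even when the sources list municipalities in a different order or cover different subsets.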
Source: Wikipedia. The municipality names are also provided in plain ASCII without special characters:
postal.code.table <- get_postal_code_info()
head(postal.code.table)
Map all municipalities to the corresponding provinces:
m2p <- municipality_to_province()
head(m2p) # Just show the first ones
Map selected municipalities to the corresponding provinces:
municipality_to_province(c("Helsinki", "Tampere", "Turku"))
Speed up the conversion with a predefined info table:
m2p <- municipality_to_province(c("Helsinki", "Tampere", "Turku"), municipality.info.mml)
head(m2p)
Municipality names to codes:
convert_municipality_codes(municipalities = c("Turku", "Tampere"))
Municipality codes to names:
convert_municipality_codes(ids = c(853, 837))
Complete conversion table:
municipality_ids <- convert_municipality_codes()
head(municipality_ids) # just show the first entries
Extract information from a Finnish personal identification number:
library(sorvi)
hetu("111111-111C")
## $hetu
## [1] "111111-111C"
##
## $gender
## [1] "Male"
##
## $personal.number
## [1] 111
##
## $checksum
## [1] "C"
##
## $date
## [1] "1911-11-11"
##
## $day
## [1] 11
##
## $month
## [1] 11
##
## $year
## [1] 1911
##
## $century.char
## [1] "-"
##
## attr(,"class")
## [1] "hetu"
Validate a Finnish personal identification number:
valid_hetu("010101-0101") # TRUE/FALSE
## [1] TRUE
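To illustrate what such a validation involves, here is a minimal sketch of the check-character rule for HETU: the nine digits (birth date plus individual number) are read as one number, and its remainder modulo 31 indexes a fixed 31-character alphabet. The helper names `hetu_check_char` and `hetu_checksum_ok` are hypothetical and are not part of sorvi. The package's own `valid_hetu` may perform additional checks, such as verifying the date.

```r
# Sketch of the HETU check-character rule (not sorvi's implementation).
hetu_check_char <- function(hetu) {
  # Characters 1-6 are the birth date (DDMMYY), 8-10 the individual number
  digits <- paste0(substr(hetu, 1, 6), substr(hetu, 8, 10))
  stopifnot(grepl("^[0-9]{9}$", digits))
  # The nine-digit number modulo 31 selects the check character
  alphabet <- strsplit("0123456789ABCDEFHJKLMNPRSTUVWXY", "")[[1]]
  alphabet[as.numeric(digits) %% 31 + 1]
}

# Compare the computed check character with the one given (character 11)
hetu_checksum_ok <- function(hetu) {
  substr(hetu, 11, 11) == hetu_check_char(hetu)
}

hetu_check_char("111111-111C")   # "C"
hetu_checksum_ok("010101-0101")  # TRUE
```

The alphabet omits letters that are easily confused with digits or with each other (G, I, O, Q, Z), which is why it has exactly 31 entries.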
Line fit with confidence smoothers (if any of the required libraries are missing, install them with the install.packages command in R):
library(sorvi)
library(plyr)
library(RColorBrewer)
library(ggplot2)
data(iris)
p <- regression_plot(Sepal.Length ~ Sepal.Width, iris)
print(p)
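One possible way to find out which of the required libraries are missing before running the example is sketched below. The variable names are illustrative; the snippet only reports missing packages and leaves the actual installation to you.

```r
# Packages used by the visualization example above
required <- c("plyr", "RColorBrewer", "ggplot2")

# requireNamespace() returns FALSE for packages that cannot be loaded
installed <- vapply(required, requireNamespace, logical(1), quietly = TRUE)
not_installed <- required[!installed]

if (length(not_installed) > 0) {
  message("Missing packages: ", paste(not_installed, collapse = ", "))
  # Install them with, for example: install.packages(not_installed)
}
```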
This work can be freely used, modified and distributed under the Two-clause BSD license.
Kindly cite the work as follows:
citation("sorvi")
This vignette was created with:
sessionInfo()
## R version 3.1.0 (2014-04-10)
## Platform: x86_64-pc-linux-gnu (64-bit)
##
## locale:
## [1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C
## [3] LC_TIME=en_US.UTF-8 LC_COLLATE=en_US.UTF-8
## [5] LC_MONETARY=en_US.UTF-8 LC_MESSAGES=en_US.UTF-8
## [7] LC_PAPER=en_US.UTF-8 LC_NAME=C
## [9] LC_ADDRESS=C LC_TELEPHONE=C
## [11] LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C
##
## attached base packages:
## [1] stats graphics grDevices utils datasets methods base
##
## other attached packages:
## [1] sorvi_0.6.23 pxR_0.40.0 plyr_1.8.1 RJSONIO_1.2-0.2
## [5] reshape2_1.4 stringr_0.6.2 reshape_0.8.5
##
## loaded via a namespace (and not attached):
## [1] colorspace_1.2-4 digest_0.6.4 evaluate_0.5.5
## [4] formatR_0.10 ggplot2_1.0.0 grid_3.1.0
## [7] gtable_0.1.2 knitr_1.6 MASS_7.3-33
## [10] munsell_0.4.2 proto_0.3-10 RColorBrewer_1.0-5
## [13] Rcpp_0.11.1 scales_0.2.4 tools_3.1.0
## [16] XML_3.98-1.1