From b27f8ef1387509c4b9d778505d185f8e0a541729 Mon Sep 17 00:00:00 2001
From: emkael
Date: Wed, 4 Mar 2015 16:38:34 +0100
Subject: * README files

---
 README.md         | 36 ++++++++++++++++++++++++++++++++++++
 fetcher/README.md | 40 ++++++++++++++++++++++++++++++++++++++++
 import/README.md  | 44 ++++++++++++++++++++++++++++++++++++++++++++
 3 files changed, 120 insertions(+)
 create mode 100644 README.md
 create mode 100644 fetcher/README.md
 create mode 100644 import/README.md

diff --git a/README.md b/README.md
new file mode 100644
index 0000000..8ae7a1c
--- /dev/null
+++ b/README.md
@@ -0,0 +1,36 @@
+Summary
+=======
+
+A set of tools that make it more comfortable to use the [Ergast database](http://ergast.com/mrd), a Formula One results database.
+
+Overview
+========
+
+The set consists of three tools:
+
+1. [fetcher](fetcher/README.md) - keeps your copy of the DB dump up to date
+2. [import](import/README.md) - allows you to import the database with some tweaks and integrity fixes
+3. [views](views/README.md) - a collection of useful or interesting database views, aggregating certain statistics
+
+Please refer to the specific README files for details on the tools.
+
+Requirements
+============
+
+These tools were developed with Unix systems in mind and are targeted at users who are not afraid of the command line. If you're semi-fluent in database administration, you'll manage.
+
+For specific technical requirements, refer to the README files of the individual tools.
+
+Author
+======
+
+If you want to contact me about the application, you can drop me a private message on Reddit: /u/emkael (although I know it's not exactly rocket science to work out other means of communication from the GitHub account alone). I'll do my best to help.
+
+License
+=======
+
+lol, idk.
+
+Do: use it, share it, mix it, modify it, analyse it - just leave some attribution.
+
+Don't: sell it?
diff --git a/fetcher/README.md b/fetcher/README.md
new file mode 100644
index 0000000..828e405
--- /dev/null
+++ b/fetcher/README.md
@@ -0,0 +1,40 @@
+Overview
+========
+
+fetcher keeps a local copy of the Ergast database dump (available from the [Ergast website](http://ergast.com/mrd/db)) up to date, using the timestamps declared in the HTTP headers.
+
+The dump is also slightly altered on unpacking to remove the MyISAM storage engine declaration hard-coded in it. This allows an easy import into a database that uses another storage engine, such as InnoDB.
+
+fetcher is part of the [ergast-goodies](README.md) toolset.
+
+Requirements
+============
+
+Since this bash script simply stitches together some tools used to retrieve and unpack the database dump, it requires these to work properly:
+
+* pushd/popd for sane working directory management (the script works in its own directory)
+* wget
+* zcat
+* sed
+* find
+* touch
+
+Most of these tools are available by default on nearly every Linux installation; the rest should be easy to install from the package repository of your distribution of choice.
+
+Usage
+=====
+
+Just run `fetcher.sh`.
+
+The script takes no arguments.
+
+Under the hood
+==============
+
+The script makes use of wget's timestamping feature. It attempts to download the f1db.sql.gz dump from the Ergast website only if the declared timestamp of the file is newer than the local timestamp of the dump file.
+
+The .sql.gz file is stored in the script directory only for that purpose.
+
+If the script is unable to connect to the remote server, that local copy is used instead.
+
+If the archive is newer than the unpacked dump, which is stored in the script directory as well, the script unpacks the dump, erasing the storage engine declaration on the fly. The resulting .sql file is ready for import into the database of your choice.
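Note on fetcher/README.md: the steps listed under "Under the hood" can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the contents of `fetcher.sh`: the function names, the `DUMP_URL` placeholder and the exact `sed` expression are all hypothetical. The only tool options relied on are wget's `-N` (timestamping) and the shell's `-nt` file test.

```shell
#!/usr/bin/env bash
# Sketch only: names, paths and the URL are assumptions, not taken from fetcher.sh.

# Placeholder address; set DUMP_URL to the real location of f1db.sql.gz.
DUMP_URL="${DUMP_URL:-http://example.invalid/f1db.sql.gz}"

# Download the archive only if the server copy is newer (wget -N = timestamping).
# A failed download is tolerated so that an existing local archive can still be used.
fetch_archive() {
    wget -N "$DUMP_URL" || echo "download failed, falling back to the local copy" >&2
}

# Remove the hard-coded storage engine from CREATE TABLE statements.
strip_engine() {
    sed 's/ ENGINE=MyISAM//'
}

# Unpack the archive only when no dump exists yet or the archive is newer than the dump.
unpack_if_newer() {
    local archive="$1" dump="$2"
    if [ ! -e "$dump" ] || [ "$archive" -nt "$dump" ]; then
        zcat < "$archive" | strip_engine > "$dump"
    fi
}
```

A run would then amount to calling `fetch_archive` followed by `unpack_if_newer f1db.sql.gz f1db.sql` from the script directory. Comparing modification times keeps repeated runs cheap: nothing is unpacked again unless a newer archive has arrived.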
diff --git a/import/README.md b/import/README.md
new file mode 100644
index 0000000..7e7b2ab
--- /dev/null
+++ b/import/README.md
@@ -0,0 +1,44 @@
+Overview
+========
+
+import provides wrapper scripts to import the [Ergast](http://ergast.com/mrd/db) database dump into a local database, and to apply technical and factual fixes.
+
+import is part of the [ergast-goodies](README.md) toolset.
+
+Requirements
+============
+
+* pushd/popd for sane working directory management (the script works in its own directory)
+* mysql command-line client
+* [fetcher](fetcher/README.md) (optional)
+
+Usage
+=====
+
+    ./00_base_import.sh [MYSQL PARAMETERS]
+
+    ./01_integrity_import.sh [MYSQL PARAMETERS]
+
+    mysql [MYSQL PARAMETERS] < ./10_fixes.sql
+
+
+Under the hood
+==============
+
+**No database credentials or connection parameters are stored or processed by the utility.** These are just wrappers, proxying all the parameters to the `mysql` client invocation.
+
+Running `00_base_import.sh` imports just the basic, raw Ergast dump. If you already have the database copied locally, you may as well skip it.
+
+The script looks for the f1db.sql dump in [fetcher](fetcher/README.md)'s directory and feeds the `mysql` command with its content, relaying all command line parameters to `mysql`.
+
+If the DB dump is not present, the script attempts to fetch it via [fetcher](fetcher/README.md).
+
+The `01_integrity_import.sh` script does the heavy lifting of improving the database under the hood, so that it can be used with modern RDBMS techniques. Its main purpose is to create indexes and proper foreign key constraints in the database.
+
+All the script does is run all the `01?_*.sql` scripts against the selected database, in alphabetical order (relying on bash wildcard expansion, [which is supposed to guarantee alphabetical order](http://serverfault.com/questions/122737/in-bash-are-wildcard-expansions-guaranteed-to-be-in-order)).
+The SQL scripts are:
+
+* `01a_integrity_fixes.sql` - fixes some of the values that would otherwise conflict with creating foreign key constraints (usually temporarily, since these issues are reported to the upstream Ergast maintainers)
+* `01b_indexes.sql` - creates indexes on some of the columns (chosen arbitrarily, but hopefully using common sense)
+* `01c_constraints.sql` - creates foreign key constraints on all foreign keys in the database
+
+Also provided is the `10_fixes.sql` SQL script. It contains fixes for factual errors in the database which have been [reported upstream](http://ergast.com/mrd/bugs), but are not yet fixed in the dump.
-- 
cgit v1.2.3
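Note on import/README.md: the behaviour described for `01_integrity_import.sh` (running every `01?_*.sql` script in wildcard order and relaying all parameters to the client) might look roughly like the sketch below. `run_sql_scripts` is a hypothetical helper written for illustration, not code from the repository. Taking the client command as arguments is an assumption made so the loop can be tried without a database.

```shell
#!/usr/bin/env bash
# Sketch only: run_sql_scripts is a hypothetical helper, not taken from 01_integrity_import.sh.

# Feed every 01?_*.sql file in a directory to a client command, in glob (alphabetical) order.
# Usage: run_sql_scripts DIR CLIENT [CLIENT ARGS...]
run_sql_scripts() {
    local dir="$1"
    shift
    local script
    for script in "$dir"/01?_*.sql; do
        [ -e "$script" ] || continue    # the pattern matched nothing
        "$@" < "$script" || return 1    # stop at the first failing script
    done
}
```

With a real client this could be invoked as `run_sql_scripts . mysql -u someuser -p somedb`, where the user and database names are placeholders. The `01?_` prefix keeps `10_fixes.sql` out of the loop, which matches the README's instruction to apply that file separately.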