This project's goal is to answer the following three questions:
-
- What are the most popular three articles of all time?
-
- Who are the most popular article authors of all time?
-
- On which days did more than 1% of requests lead to errors?
This project requires Python3 or greater to run, download the version compatible with your operating system.
Be sure to download and install the appropriate version for your operating system. To confirm that Vagrant has been correctly installed, in your operating systems command prompt type vagrant --version
, if the response is something akin to Vagrant 1.8.5
you can proceed to the next step.
Download Jasons-Analysis-Project
Save the project to a local folder of your choice.
Using a terminal or command prompt, navigate to the project directory and type vagrant up
to start preparing the vagrant environment. It may take a few minutes depending on internet speed etc. When the process has completed, run vagrant ssh
and you will be logged in to your VM.
Download newsdata.sql
Within the .zip
file, you specifically need newsdata.sql
. newsdata.sql
is the database that this project will query information from. Download it and place it in the vagrant directory that's shared with your newly started VM, specifically place the file here ~/Jasons-Analysis-Project/FSND-Virtual-Machine/vagrant/newsdata.sql
in the same folder as jasons_analysis_project.py
.
Within your running Vagrant VM, run the command:
psql -d news -f newsdata.sql;
Note This project has been designed to automatically create or replace all views necessary, direct interaction with the psql
interface should be unnecessary.
Regardless, here are the following views that are created by the project upon start:
top_slugs_view
CREATE OR REPLACE VIEW top_slugs_view AS SELECT path,
COUNT(*) AS num_views FROM log
WHERE status = '200 OK'
AND NOT path = '/'
GROUP BY path ORDER BY num_views DESC;
alt_view
CREATE OR REPLACE VIEW alt_view AS SELECT author, num_views
FROM articles, top_slugs_view
WHERE top_slugs_view.path = CONCAT('/article/', slug)
ORDER BY num_views DESC;
x_view
CREATE OR REPLACE VIEW x_view AS SELECT author, sum(num_views)
FROM alt_view
GROUP BY author;
errors_day_view
CREATE OR REPLACE VIEW errors_day_view AS SELECT
DATE(time), COUNT(*) AS num_views
FROM log
WHERE status = '404 NOT FOUND'
GROUP BY status, DATE
ORDER BY num_views DESC;
hits_day_view
CREATE OR REPLACE VIEW hits_day_view AS SELECT
DATE(time), COUNT(*) AS num_views
FROM log GROUP BY DATE(time);
From within your running Vagrant VM, ensure you're in the vagrant
folder that contains jasons_analysis_project.py
and type the command python3 jasons_analysis_project.py
and press enter
.
You should now see something like:
-------------------------------------------------
Welcome to Jason's Python Database Query Machine!
-------------------------------------------------
1) Top 3 Articles
2) Top Authors
3) Days With Greatest Percent Request Errors
4) Exit
Select an option 1-4:
Enter a number 1
through 3
to query information from the database or enter 4
to exit the project.