
Where to get datasets to practice data science?

Data is the new science. Big data holds the answers. - Pat Gelsinger, CEO

1. ProgrammableWeb

  • Description:  A directory where you can find APIs for extracting data from some of the biggest sites on the internet.
  • Link address: API Directory
  • Examples: Google Maps API, Instagram API, Twitter API, etc.
2. Postman API Development

  • Description:  An online tool that you can use to explore and send requests to a large number of APIs on the internet. You can also use it to develop and test your own API if you run a site.
  • Link address: API TOOL
  • Examples: PayPal API, Adobe API, Coursera API, etc.
3. Facebook Graph API
  • Description:  An online tool that you can use to access data about Facebook pages.
  • Link address: API
  • Examples: graph.facebook.com/youtube  - Access page data, e.g. likes, number of posts, etc.
4. APIGEE
  • Description:  An online GUI tool that lets you extract data from, and send data to, various web platforms.
  • Link address: GET & POST TO API
  • Examples: Bing API, GitHub API, Heroku API, etc.
5. Kaggle
  • Description:  A data science and machine learning community where you can obtain clean datasets from various sources.
  • Link address: Datasets
  • Examples: Titanic dataset, South African Crime dataset, Trending YouTube Video Statistics, etc.
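
To make the Facebook Graph entry (item 3) a little more concrete, here is a minimal sketch of how a request to graph.facebook.com/youtube could be built and sent using only Python's standard library. The field names (name, fan_count) and the YOUR_ACCESS_TOKEN placeholder are assumptions for illustration, not values from this post. The Graph API generally expects an access token, so check the current Facebook documentation for the fields and permissions that apply to you.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

# Host taken from the example in item 3 of the list.
GRAPH_BASE = "https://graph.facebook.com"


def build_page_url(page, fields, access_token):
    """Build a request URL asking for selected fields of a page."""
    query = urlencode({"fields": ",".join(fields), "access_token": access_token})
    return f"{GRAPH_BASE}/{page}?{query}"


def fetch_json(url, timeout=10):
    """Send a GET request and decode the JSON body into Python objects."""
    with urlopen(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))


# "YOUR_ACCESS_TOKEN" is a placeholder; the field names are assumed for illustration.
url = build_page_url("youtube", ["name", "fan_count"], "YOUR_ACCESS_TOKEN")
print(url)

# Uncomment once you have a valid token:
# page_data = fetch_json(url)
# print(page_data)
```

The same fetch_json helper should work with most of the JSON APIs listed above. Usually only the URL and the authentication details change.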

