
Where to get datasets to practice data science?

Data is the new science. Big data holds the answers. - Pat Gelsinger, CEO

1. Programmable Web 

  • Description: A directory site where you can find APIs for extracting data from some of the biggest sites on the internet.
  • Link address: API Directory
  • Examples: Google Maps API, Instagram API, Twitter API etc.
2. Postman API Development

  • Description:  An online tool that you can use to access millions of APIs on the internet. You can also develop your own API if you happen to own a site.
  • Link address: API TOOL 
  • Examples: PayPal API, Adobe API, Coursera API etc.
3. Facebook Graph API
  • Description:  An online tool that you can use to access data about Facebook pages.
  • Link address: API 
  • Examples: graph.facebook.com/youtube - Access page data, e.g. likes, number of posts etc.
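
To make the graph.facebook.com/youtube example a little more concrete, here is a minimal Python sketch of how such a request could be put together. It rests on a few assumptions that are not in the list above: the Graph API generally expects an access token, so "YOUR_TOKEN" is only a placeholder, the field names "name" and "fan_count" are illustrative choices, and the helper names build_page_url and fetch_page are made up for this post.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen

GRAPH_BASE = "https://graph.facebook.com"


def build_page_url(page, fields, access_token):
    """Build a page URL such as graph.facebook.com/youtube with a query string."""
    query = urlencode({"fields": ",".join(fields), "access_token": access_token})
    return f"{GRAPH_BASE}/{page}?{query}"


def fetch_page(page, fields, access_token):
    """Send the request and decode the JSON body (needs network and a valid token)."""
    with urlopen(build_page_url(page, fields, access_token)) as response:
        return json.loads(response.read().decode("utf-8"))


if __name__ == "__main__":
    # Only prints the URL; "YOUR_TOKEN" is a placeholder, not a working token.
    print(build_page_url("youtube", ["name", "fan_count"], "YOUR_TOKEN"))
```

Calling fetch_page("youtube", ["name", "fan_count"], token) with a real token should return a dictionary of the requested fields, provided the token has permission to read that page.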
4. APIGEE
  • Description: An online GUI tool that lets you extract data from, and send data to, various web platforms.
  • Link address: GET & POST TO API
  • Examples: Bing API, GitHub API, Heroku API etc.
5. Kaggle
  • Description:  A data science and machine learning community where you can obtain clean datasets from various sources.
  • Link address: Datasets
  • Examples: Titanic dataset, South African Crime dataset, Trending YouTube video statistics etc.
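
Once you have downloaded a dataset as a CSV file, a first look at it can be as simple as the sketch below. It uses only the Python standard library. The rows in SAMPLE_CSV are made up for illustration and only mimic a few column names from the Titanic dataset (Survived, Pclass, Sex, Age); they are not real records, and the helper names load_rows and survival_rate are hypothetical.

```python
import csv
import io

# Made-up rows that only mimic the column layout of a Titanic-style CSV.
SAMPLE_CSV = """Survived,Pclass,Sex,Age
0,3,male,22
1,1,female,38
1,3,female,
0,1,male,54
"""


def load_rows(handle):
    """Read an open CSV file handle into a list of dicts keyed by column name."""
    return list(csv.DictReader(handle))


def survival_rate(rows, sex):
    """Fraction of passengers of the given sex whose Survived flag is 1."""
    group = [row for row in rows if row["Sex"] == sex]
    if not group:
        return 0.0
    return sum(int(row["Survived"]) for row in group) / len(group)


rows = load_rows(io.StringIO(SAMPLE_CSV))
print(len(rows), survival_rate(rows, "female"))
```

For a real download, swap the in-memory sample for a file handle, for example open("train.csv", newline=""), where the file name is an assumption about what you saved. Note that missing values such as the empty Age above come through as empty strings, so they need handling before any numeric work.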

