These are my summary notes on Applied Predictive Modeling. This part covers the model-building process and details the pre-processing strategies.
jQuery-like manipulation
Select elements and bind data to them
- datum(): bind a single datum to the selected elements
- data(): bind an array of data to the selected elements, one item per element
SVG Canvas & Scales & Axis
D3.js uses SVG as its drawing canvas.
delay() sets the delay before a transition starts.
A list of built-in layouts:
- Bundle - apply Holten’s hierarchical bundling algorithm to edges.
- Chord - produce a chord diagram from a matrix of relationships.
- Cluster - cluster entities into a dendrogram.
- Force - position linked nodes using physical simulation.
- Hierarchy - derive a custom hierarchical layout implementation.
- Histogram - compute the distribution of data using quantized bins.
- Pack - produce a hierarchical layout using recursive circle-packing.
- Partition - recursively partition a node tree into a sunburst or icicle.
- Pie - compute the start and end angles for arcs in a pie or donut chart.
- Stack - compute the baseline for each series in a stacked bar or area chart.
- Tree - position a tree of nodes tidily.
- Treemap - use recursive spatial subdivision to display a tree of nodes.
Apache Spark is a distributed in-memory cluster computing system. Many people, including me, like to use Spark from Python with IPython for data analysis.
Unfortunately, the configuration is still a little bit tricky at the moment.
Steps to follow:
- Download spark and unzip:
wget http://d3kbcqa49mib13.cloudfront.net/spark-1.5.1-bin-hadoop2.6.tgz && tar -zxvf spark-1.5.1-bin-hadoop2.6.tgz
- Set the SPARK_HOME environment variable to the unzipped folder; don’t forget to source your .bashrc or .zshrc.
- Install findspark, which is as simple as running pip install findspark.
- Get into IPython and play
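A minimal sketch of the last step, assuming findspark and a Spark distribution are installed as described in the steps above. The helper names `spark_home` and `start_spark_context` and the application name are hypothetical, chosen for illustration; only `findspark.init()` and `pyspark.SparkContext` come from the libraries themselves.

```python
import os


def spark_home(env=None):
    """Return the Spark folder from SPARK_HOME, or fail with a clear message."""
    env = os.environ if env is None else env
    home = env.get("SPARK_HOME")
    if not home:
        raise RuntimeError("SPARK_HOME is not set; export it in .bashrc or .zshrc")
    return home


def start_spark_context(app_name="ipython-demo"):
    """Make pyspark importable via findspark, then create a SparkContext."""
    import findspark  # installed with: pip install findspark

    findspark.init(spark_home())  # adds Spark's Python libraries to sys.path
    import pyspark  # importable only after findspark.init()

    return pyspark.SparkContext(appName=app_name)
```

In an IPython session, `sc = start_spark_context()` followed by `sc.parallelize(range(10)).sum()` is a quick way to check that the context works.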
That’s it, go play with the
I recently purchased a new add-on for the Raspberry Pi. After the successful publicity of sending the Sense HAT to the International Space Station, the board is now available for purchase: link, online shopping, French store.
To get started with the little board, all you have to do is run the following command to install the driver:
After rebooting the Raspberry Pi, you will have access to the board.
Cannot get sense-hat installed
This problem seems to be related to the operating system. I had HypriotOS installed, and its repository does not provide sense-hat, so I reinstalled the original image of
Still not working
Make sure you have rebooted your Raspberry Pi. If the Sense HAT shows no lit LEDs after the reboot, it has been installed successfully.
Make sure you have enabled I2C: run sudo raspi-config and select 8. Advanced Options - I2C - Enable.
With the provided examples, you can already play with the Sense HAT.
Here is a quick start with
So I went a step further and displayed a heart, in order to convince my GF (lol) that the purchase was worth it:
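The original snippet is not reproduced here, so the following is only a sketch of how a heart could be drawn on the 8x8 LED matrix. The bitmap, the colours, and the helper names `heart_pixels` and `show_heart` are assumptions; `SenseHat.set_pixels()` expects a list of 64 `[R, G, B]` values, which is what the helper builds.

```python
RED = [255, 0, 0]
OFF = [0, 0, 0]

# 8x8 bitmap: "X" marks a lit LED, "." an unlit one.
HEART = [
    ".XX..XX.",
    "XXXXXXXX",
    "XXXXXXXX",
    "XXXXXXXX",
    ".XXXXXX.",
    "..XXXX..",
    "...XX...",
    "........",
]


def heart_pixels(on=RED, off=OFF):
    """Flatten the bitmap row by row into 64 [R, G, B] pixels."""
    return [on if cell == "X" else off for row in HEART for cell in row]


def show_heart():
    """Display the heart; this only works on a Pi with the Sense HAT attached."""
    from sense_hat import SenseHat  # imported lazily so the rest runs anywhere

    sense = SenseHat()
    sense.set_pixels(heart_pixels())
```

Keeping the bitmap as strings makes it easy to tweak the shape by eye before sending it to the board with `show_heart()`.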
Currently, I am working on a project that requires visualizing polygons on a map. The problem I encountered is that, in order to visualize the result in
leafletjs, I need to convert the shapefile to GeoJSON format.
The solution is to use the rgdal package to do the conversion.
The input is the shapefile, transformed to a
sp::SpatialPolygonsDataFrame, and the output is a geojson file.
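To make the target format concrete, here is a small Python sketch of the kind of GeoJSON FeatureCollection such a conversion produces for a single polygon. This is not the rgdal conversion itself; the function name, coordinates, and property names are made up for illustration.

```python
import json


def polygon_feature_collection(ring, properties):
    """Wrap one polygon ring (a list of [lon, lat] pairs) as GeoJSON text."""
    if ring[0] != ring[-1]:
        ring = ring + [ring[0]]  # GeoJSON rings must end where they start
    feature = {
        "type": "Feature",
        "properties": properties,
        "geometry": {"type": "Polygon", "coordinates": [ring]},
    }
    return json.dumps({"type": "FeatureCollection", "features": [feature]})


# Hypothetical square, given as [longitude, latitude] pairs.
geojson_text = polygon_feature_collection(
    [[2.29, 48.85], [2.30, 48.85], [2.30, 48.86], [2.29, 48.86]],
    {"name": "example block"},
)
```

Each row of the shapefile's attribute table ends up in a feature's `properties`, and each shape in its `geometry`.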
The ultimate use case is to render the result live with shiny (although this is quite limited by the number of polygons); here is a function to do so:
The function returns a string variable with
Geocoding is the process of turning an address or place name into geographic coordinates; reverse geocoding does the opposite.
More detailed definition: wikipedia
Address –> Latitude,Longitude: Geocoding
Latitude,Longitude –> Address: Reverse geocoding
Various providers implement these services; here is a list:
With these services available, the next step is to write a script that accesses a service’s API.
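As an illustration of such a script, here is a minimal sketch that assumes OpenStreetMap's Nominatim as the provider (my choice for the example; it is not necessarily one of the providers listed in the post). The helper names and the `User-Agent` value are hypothetical, and the public Nominatim instance has a usage policy that should be checked before sending real traffic.

```python
import json
import urllib.parse
import urllib.request

BASE = "https://nominatim.openstreetmap.org"


def geocode_url(address):
    """Build the request URL for address -> latitude, longitude."""
    query = urllib.parse.urlencode({"q": address, "format": "json", "limit": 1})
    return f"{BASE}/search?{query}"


def reverse_geocode_url(lat, lon):
    """Build the request URL for latitude, longitude -> address."""
    query = urllib.parse.urlencode({"lat": lat, "lon": lon, "format": "json"})
    return f"{BASE}/reverse?{query}"


def fetch_json(url, user_agent="geocoding-demo"):
    """Send the request and decode the JSON answer (needs network access)."""
    request = urllib.request.Request(url, headers={"User-Agent": user_agent})
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.load(response)
```

For example, `fetch_json(geocode_url("Eiffel Tower, Paris"))` would return a list of matches carrying `lat` and `lon` fields, while `fetch_json(reverse_geocode_url(48.8584, 2.2945))` goes the other way.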
Application available via StarryLab
By using shiny, we can easily implement the modeling and visualization in a web page.
Before creating the application, a minimal design is required. In the 4th post, we sketched the first design. I have now enhanced and modified it as follows:
I will go into detail about some shiny techniques next time, since I have received requests to cover back-end and front-end communication and shiny programming techniques.
After considering the modeling framework, the next step is to carry out the analysis.
As suggested in the previous schemas, to build a model that predicts the daily number of subscribers/pass users and the number of trips, we are going to combine several data sources into one dataframe, then run a simple randomForest algorithm and the caret package (machine learning for
Here is the required library:
Recently I discovered a low-cost VPS service, so I decided to create a small lab (not so powerful) to demonstrate and use the power of the R/Python/Spark (not yet) toolbox.
I set up a Cloud Sandbox Large instance with Ubuntu 14.04, which has:
- 1 core
- 4 GB RAM
- 30 GB SSD
- 1 TB Bandwidth*
This is sufficient to run R/Python.
Once the instance is set up, log in via ssh. Before any configuration, update and upgrade the system first.
P.S. Server configuration and DNS redirection are not documented here.
After assembling the hardware, the next step is to install the operating system.
There are quite a lot of choices of Operating Systems for Raspberry Pi. As we can see in the official download page:
- NOOBS: Raspbian, with the option to net-install other OSes
- OPENELEC: Media Center
- OSMC: Media Center
- Raspbmc: another media center
And other personal/3rd-party distributions:
- dietRaspbian: minimized
- OpenWrt: a router OS