I’ve been to Spark and back. But I did leave some of my soul.
According to Apache, Spark was developed to “write applications quickly in Java, Scala, Python, R, and SQL.”
And I’m sure it’s true. Or at least I’m sure their intentions were noble.
I’m not talking about Scala yet, or Java; those are whole other languages. I’m talking about Spark with Python. Or PySpark, as the Ogilvy-inspired geniuses at Apache marketing call it.
The learning curve is not easy, my pretties, but luckily for you, I’ve managed to sort out some of the basic ecosystem and how it all operates. Brevity is my goal.
This doesn’t include MLlib, or GraphX, or Streaming, just the basics.
Import some data
train = sqlContext.read.option("header", "true") \
    .option("inferSchema", "true") \
    .format("csv") \
    .load("train_V2.csv") \
    .limit(20000)
Show the head of a dataframe
List the columns and their value types
Show a number of rows in a better format
Count the number of rows
List column names
Show mean, median, standard deviation, etc...
Show mean, median, standard deviation, etc... of just one column
Show only certain columns
Get the distinct values of a column
That's it for now...