Spark and Scala notes

Mon 01 April 2019

These notes are just assorted tricks that I have picked up on my journey with Scala.

Spark

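Spark has transformation functions such as map and filter. Each one lazily describes a new RDD, and nothing is computed until an action such as collect or count runs.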
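
To make the idea of a transformation concrete, here is a minimal sketch, assuming Spark is on the classpath and run in local mode. The object name TransformationSketch and the helper evenSquares are made up for illustration; map, filter, parallelize and collect are the standard RDD calls.

```scala
import org.apache.spark.{SparkConf, SparkContext}
import org.apache.spark.rdd.RDD

object TransformationSketch {
  // Hypothetical helper: chains two transformations.
  // Nothing is computed when this method returns; it only describes the work.
  def evenSquares(numbers: RDD[Int]): RDD[Int] =
    numbers.map(n => n * n).filter(_ % 2 == 0)

  def main(args: Array[String]): Unit = {
    // local[2] is an assumption for trying this on one machine.
    val conf = new SparkConf().setAppName("transformation-sketch").setMaster("local[2]")
    val sc = new SparkContext(conf)
    try {
      // collect() is an action, so the map and filter actually run here.
      val result = evenSquares(sc.parallelize(1 to 6)).collect()
      println(result.mkString(", ")) // prints "4, 16, 36"
    } finally sc.stop()
  }
}
```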

Facts to take into account

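  • combineByKey is usually preferred over groupByKey, because it merges values inside each partition before the shuffle, so less data crosses the network.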
  • Dealing with Spark and Hadoop requires us to keep in mind that we are working in a distributed setting.
  • The -byKey methods operate on pair RDDs, that is, RDDs of (key, value) tuples.
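
As a rough illustration of the first and last points, the sketch below computes a per-key average twice, once with combineByKey and once with groupByKey. It assumes Spark is on the classpath; the object and method names are invented for this example. Both versions give the same answer, but the combineByKey one keeps only a running (sum, count) pair per key on each partition instead of shipping every value.

```scala
import org.apache.spark.rdd.RDD

object CombineByKeySketch {
  // Hypothetical helper: average per key using combineByKey.
  def averageWithCombine(scores: RDD[(String, Double)]): RDD[(String, Double)] =
    scores.combineByKey(
      (v: Double) => (v, 1),                                        // first value seen for a key
      (acc: (Double, Int), v: Double) => (acc._1 + v, acc._2 + 1),  // merge a value within a partition
      (a: (Double, Int), b: (Double, Int)) => (a._1 + b._1, a._2 + b._2) // merge across partitions
    ).mapValues { case (sum, count) => sum / count }

  // Same result with groupByKey: every value is shuffled before averaging.
  def averageWithGroup(scores: RDD[(String, Double)]): RDD[(String, Double)] =
    scores.groupByKey().mapValues(vs => vs.sum / vs.size)
}
```

Both methods only work because scores is an RDD of tuples; on a plain RDD[Double] the -byKey methods are not available.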

Scala

Seq and Vector vs ListBuffer

To append an element to a Seq I use the :+ operator, which returns a new collection and leaves the original unchanged.
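
A small sketch of the difference, using only the standard library (the object name AppendSketch is just for illustration). Seq and Vector give back a new collection on :+, while a mutable ListBuffer is changed in place by +=.

```scala
import scala.collection.mutable.ListBuffer

object AppendSketch {
  val base: Seq[Int] = Seq(1, 2, 3)

  // :+ appends and returns a new Seq; base itself is untouched.
  val appended: Seq[Int] = base :+ 4

  // +: is the prepend counterpart (the colon always faces the collection).
  val prepended: Seq[Int] = 0 +: base

  // Vector supports the same operator.
  val vector: Vector[Int] = Vector(1, 2, 3) :+ 4

  // ListBuffer is mutable: += modifies the buffer in place.
  val buffer: ListBuffer[Int] = ListBuffer(1, 2, 3)
  buffer += 4
}
```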

apply functions

I liked this one: to get element i from a Seq object, we can write seq apply i, which is the same as seq(i).
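
Here is a minimal sketch of what apply does, standard library only. The Doubler class is a made-up example to show that any object with an apply method can be called like a function. Note that newer Scala 3 compilers may warn about the infix form for methods not declared infix, so letters(1) is the safer spelling.

```scala
object ApplySketch {
  val letters: Seq[String] = Seq("a", "b", "c")

  // Three spellings of the same call.
  val viaInfix: String  = letters apply 1
  val viaMethod: String = letters.apply(1)
  val viaSugar: String  = letters(1)

  // Hypothetical class: defining apply makes instances callable.
  class Doubler {
    def apply(x: Int): Int = x * 2
  }

  val doubler = new Doubler
  val eight: Int = doubler(4) // rewritten by the compiler to doubler.apply(4)
}
```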

Scope of variables

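Be careful: variable is declared inside the if block, so it is out of scope on the last line!

if (true) {
   var variable = "value"
}
println(variable) // does not compile: variable is not visible outside the if block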
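
Two possible ways around this, as a sketch rather than the one right answer (the names ScopeSketch, condition and fromExpression are mine). Either declare the variable before the block, or use the fact that if is an expression in Scala and assign its result to a val.

```scala
object ScopeSketch {
  val condition = true

  // Option 1: declare the var in the outer scope, assign inside the block.
  var variable = ""
  if (condition) {
    variable = "value"
  }

  // Option 2: if/else is an expression, so no var is needed at all.
  val fromExpression: String = if (condition) "value" else ""
}
```

The second form is usually considered more idiomatic because it avoids mutable state.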

More concepts that I need to learn

  • The yield instruction.
  • The apply functions.
  • Difference between object, class, case class, and trait.
  • Growable and Shrinkable objects.