The de facto languages of Big Data and data science are
Other languages include
In our course, we will be using Scala and Python.
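Scala

object Hello extends App {
  println("Hello, world")
  for (i <- 1 to 10) {
    // Java's standard library is directly accessible
    System.out.println("Hello")
  }
}

Scala blocks are delimited by { and }.

Python

#!/usr/bin/env python3
for i in range(1, 11):  # range excludes its upper bound
    print("Hello, world")

Scala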
val a: Int = 5
val b = 5
b = 6 // Compile error: reassignment to val

// Type of foo is inferred
val foo = new ImportantClass(...)

var a = "Foo"
a = "Bar"
a = 4 // Compile error: type mismatch

vals are single-assignment; vars can be reassigned.

Python

a = 5
a = "Foo"
a = ImportantClass(...)

Scala
def max(x: Int, y: Int): Int = {
  if (x >= y)
    x
  else
    y
}

Python

def max(x, y):
    if x >= y:
        return x
    else:
        return y

Scala
def bigger(x: Int, y: Int,
           f: (Int, Int) => Boolean) = {
  f(x, y)
}

bigger(1, 2, (x, y) => (x < y))
bigger(1, 2, (x, y) => (x > y))
// Compile error: f must take two Ints
bigger(1, 2, x => x)

Python

def bigger(x, y, f):
    return f(x, y)

bigger(1, 2, lambda x, y: x > y)
bigger(1, 2, lambda x, y: x < y)
# Runtime error (TypeError): the lambda takes one argument, but f is called with two
bigger(1, 2, lambda x: x)

In both cases, bigger is a higher-order function, i.e. a function whose behaviour is parametrised by another function; f is its function parameter. To call a higher-order function, we need to pass a function that takes the appropriate arguments. Scala checks this at compile time; Python only detects a mismatch when the call runs.
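The runtime failure in the Python version can be observed directly. The following is a minimal sketch, assuming Python 3; the helper try_bigger is hypothetical and exists only to catch and report the error.

```python
def bigger(x, y, f):
    """Higher-order function: the comparison is supplied by the caller."""
    return f(x, y)


def try_bigger(f):
    """Hypothetical helper: call bigger(1, 2, f) and report an unsuitable f."""
    try:
        return bigger(1, 2, f)
    except TypeError as err:
        # Python notices the wrong number of parameters only at call time.
        return "TypeError: " + str(err)


print(try_bigger(lambda x, y: x < y))  # True
print(try_bigger(lambda x, y: x > y))  # False
print(try_bigger(lambda x: x))         # a message starting with "TypeError:"
```

In Scala the third call would never run, because the program is rejected by the compiler.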
Scala

class Foo(val x: Int,
          var y: Double = 0.0)

// Type of a is inferred
val a = new Foo(1, 4.0)
println(a.x) // x is read-only
println(a.y) // y is read-write
a.y = 10.0
println(a.y) // prints 10.0
a.y = "Foo" // Compile error: type mismatch, y is a Double

val declares a read-only attribute; var declares a read-write one.

Python

class Foo:
    def __init__(self, x, y):
        self.x = x
        self.y = y

a = Foo(3, 2)
print(a.x)
a.x = "foo"  # Attributes can be rebound to a value of any type
print(a.x)

Scala
class Foo(val x: Int,
          var y: Double = 0.0)

class Bar(x: Int, y: Int, z: Int)
    extends Foo(x, y)

trait Printable {
  val s: String
  def asString(): String
}

// A concrete class must implement the trait's abstract members
class Baz(x: Int, y: Double, private val z: Int)
    extends Foo(x, y)
    with Printable {
  val s: String = "Baz"
  def asString(): String = s
}

Python

class Foo:
    def __init__(self, x, y):
        self.x = x
        self.y = y

class Bar(Foo):
    def __init__(self, x, y, z):
        super().__init__(x, y)
        self.z = z

In both cases, the traditional rules of method overriding apply. Traits in Scala are similar to interfaces in Java; in addition, they may implement the methods they declare and may include attributes (state).
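Python has no traits. A rough analogue, assuming Python 3, is an abstract base class used as a mixin: it can declare methods that subclasses must provide and also ship implemented ones. The sketch below is illustrative only; the method names as_string and describe are hypothetical.

```python
from abc import ABC, abstractmethod


class Printable(ABC):
    """Rough stand-in for the Scala trait."""

    @abstractmethod
    def as_string(self):
        """Declared only: every concrete subclass must implement it."""

    def describe(self):
        # Implemented method, inherited by every subclass.
        return "Printable: " + self.as_string()


class Foo:
    def __init__(self, x, y=0.0):
        self.x = x
        self.y = y


class Baz(Foo, Printable):
    def __init__(self, x, y, z):
        super().__init__(x, y)
        self._z = z  # leading underscore: private by convention only

    def as_string(self):
        return f"Baz({self.x}, {self.y}, {self._z})"


b = Baz(1, 2.0, 3)
print(b.describe())  # Printable: Baz(1, 2.0, 3)
```

Unlike Scala, Python reports a missing as_string only when the class is instantiated, not when it is defined.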
case class Address(street: String, number: Int)
case class Person(name: String, address: Address)

val a1 = Address("Mekelweg", 4)
val p1 = Person("Georgios", a1)
val p2 = Person("Georgios", a1)
p1 == p2 // true

Case classes are blueprints for immutable objects. We use them to represent data records. Scala automatically implements hashCode and equals for them, so we can compare them directly.
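The closest Python counterpart to a case class is arguably a frozen dataclass from the standard library. The following sketch, assuming Python 3.7 or later, mirrors the Address and Person records above; it is an analogy, not an exact equivalent.

```python
from dataclasses import dataclass, FrozenInstanceError


@dataclass(frozen=True)
class Address:
    street: str
    number: int


@dataclass(frozen=True)
class Person:
    name: str
    address: Address


a1 = Address("Mekelweg", 4)
p1 = Person("Georgios", a1)
p2 = Person("Georgios", Address("Mekelweg", 4))

print(p1 == p2)              # True: compared field by field
print(hash(p1) == hash(p2))  # True: equal records hash equally

try:
    p1.name = "Someone else"
except FrozenInstanceError:
    print("records are immutable")
```

With frozen=True the generated __eq__ and __hash__ play the role of Scala's equals and hashCode.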
Pattern matching is if..else on steroids
// Code for demo only, won't compile
value match {
  // Match on a value, like if
  case 1 => "One"
  // Match on the contents of a list
  case x :: xs => "The remaining contents are " + xs
  // Match on a case class, extract values
  case Email(addr, title, _) => s"New email: $title..."
  // Match on the type
  case xs: List[_] => "This is a list"
  // With a pattern guard
  case xs: List[Int] if xs.head == 5 => "This is a list of integers"
  case _ => "This is the default case"
}

This is by no means a full introduction to either programming language. Please read more here
Pick one and become good at it!
This work is (c) 2017, 2018, 2019 - onwards by TU Delft and Georgios Gousios and licensed under the Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International license.