The de facto languages of Big Data and data science are
Other languages include
In our course, we will be using Scala and Python.
Scala
object Hello extends App {
  println("Hello, world")
  for (i <- 1 to 10) {
    System.out.println("Hello")
  }
}
Blocks are delimited by { and }.
Python
#!/usr/bin/env python
# range(1, 10) yields 1..9 (the upper bound is excluded)
for i in range(1, 10):
    print("Hello, world")
Scala
val a: Int = 5
val b = 5
b = 6 // compile error: reassignment to val
// Type of foo is inferred
val foo = new ImportantClass(...)
var a = "Foo"
a = "Bar"
a = 4 // compile error: type mismatch
vals are single-assignment, vars are multiple-assignment.
Python
a = 5
a = "Foo"
a = ImportantClass(...)
Scala
def max(x: Int, y: Int): Int = {
  if (x >= y)
    x
  else
    y
}
Python
def max(x, y):
    if x >= y:
        return x
    else:
        return y
Scala
def bigger(x: Int, y: Int,
           f: (Int, Int) => Boolean) = {
  f(x, y)
}
bigger(1, 2, (x, y) => (x < y))
bigger(1, 2, (x, y) => (x > y))
// Compile error: f must take two Ints and return a Boolean
bigger(1, 2, x => x)
Python
def bigger(x, y, f):
    return f(x, y)

bigger(1, 2, lambda x, y: x > y)
bigger(1, 2, lambda x, y: x < y)
# Runtime error (TypeError): the lambda takes one argument, f is called with two
bigger(1, 2, lambda x: x)
In both cases, bigger is a higher-order function, i.e. a function whose behaviour is parametrised by another function; f is a function parameter. To call a higher-order function, we need to pass it a function that accepts the appropriate arguments. In Scala the compiler checks this; in Python a mismatch only surfaces at runtime.
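A minimal runnable sketch of the Python side, assuming Python 3. It passes ready-made functions from the standard operator module instead of lambdas, and shows that a function of the wrong arity is only rejected when bigger actually calls it:

```python
import operator


def bigger(x, y, f):
    """Higher-order function: the comparison itself is supplied by the caller."""
    return f(x, y)


# Any callable taking two arguments will do, not only lambdas.
print(bigger(1, 2, operator.lt))  # True
print(bigger(1, 2, operator.gt))  # False

# A one-argument function is accepted as a parameter; the mismatch
# is only detected when bigger() calls it with two arguments.
try:
    bigger(1, 2, lambda x: x)
except TypeError as err:
    print("rejected at call time:", err)
```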
Scala
class Foo(val x: Int,
          var y: Double = 0.0)
// Type of a is inferred
val a = new Foo(1, 4.0)
println(a.x) // x is read-only
println(a.y) // y is read-write
a.y = 10.0
println(a.y) // prints 10.0
a.y = "Foo" // compile error: type mismatch, y is a Double
val means a read-only attribute; var is read-write.
Python
class Foo():
    def __init__(self, x, y):
        self.x = x
        self.y = y

a = Foo(3, 2)
print(a.x)
a.x = "foo"  # allowed: attributes are writable and untyped
print(a.x)
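Python attributes have no val/var distinction, but a read-only attribute comparable to a Scala val can be approximated with a property that has no setter. The following is a sketch under that assumption, not the only way to do it; the _x name is an illustrative convention, and nothing stops a determined caller from writing to it directly:

```python
class Foo:
    def __init__(self, x, y=0.0):
        self._x = x  # leading underscore: private by convention only
        self.y = y   # plain attribute, freely writable (like a var)

    @property
    def x(self):
        """Read-only view of _x; no setter is defined."""
        return self._x


a = Foo(1, 4.0)
a.y = 10.0
print(a.x, a.y)

try:
    a.x = 5  # no setter, so the assignment is refused
except AttributeError as err:
    print("cannot assign to x:", err)
```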
Scala
class Foo(val x: Int,
          var y: Double = 0.0)
class Bar(x: Int, y: Int, z: Int)
  extends Foo(x, y)
trait Printable {
  val s: String
  def asString(): String
}
// Baz must implement the abstract members of Printable
class Baz(x: Int, y: Int, private val z: Int)
  extends Foo(x, y)
  with Printable {
  val s: String = "Baz"
  def asString(): String = s
}
Python
class Foo():
    def __init__(self, x, y):
        self.x = x
        self.y = y

class Bar(Foo):
    def __init__(self, x, y, z):
        Foo.__init__(self, x, y)
        self.z = z
In both cases, the traditional rules of method overriding apply. Traits in Scala are similar to interfaces in Java; in addition, declared methods may be implemented and they can include attributes (state).
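As a rough illustration of overriding in Python 3, the sketch below adds a describe method to Foo, overrides it in Bar, and uses a mixin class to play approximately the role of the Scala trait. The names describe and as_string are hypothetical, and a mixin is only an approximation of a trait, since Python does not force subclasses to implement anything:

```python
class Foo:
    def __init__(self, x, y):
        self.x = x
        self.y = y

    def describe(self):
        return "Foo(%s, %s)" % (self.x, self.y)


class Printable:
    """Mixin with an implemented method, similar in spirit to a trait."""

    def as_string(self):
        # Dispatches dynamically to the most specific describe().
        return "<" + self.describe() + ">"


class Bar(Foo, Printable):
    def __init__(self, x, y, z):
        super().__init__(x, y)
        self.z = z

    def describe(self):  # overrides Foo.describe
        return "Bar(%s, %s, %s)" % (self.x, self.y, self.z)


b = Bar(1, 2, 3)
print(b.as_string())  # <Bar(1, 2, 3)>
```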
case class Address(street: String, number: Int)
case class Person(name: String, address: Address)
val a1 = new Address("Mekelweg", 4)
val p1 = new Person("Georgios", a1)
val p2 = new Person("Georgios", a1)
p1 == p2 // true
Case classes are blueprints for immutable objects. We use them to represent data records. Scala automatically implements hashCode and equals for them, so we can compare them directly.
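Python has no case classes, but a frozen dataclass from the standard library (Python 3.7 or later) behaves similarly for this purpose: it generates equality and hashing from the fields and rejects attribute assignment. The sketch below mirrors the Address and Person records above, as an analogue rather than an exact equivalent:

```python
from dataclasses import dataclass, FrozenInstanceError


@dataclass(frozen=True)
class Address:
    street: str
    number: int


@dataclass(frozen=True)
class Person:
    name: str
    address: Address


a1 = Address("Mekelweg", 4)
p1 = Person("Georgios", a1)
p2 = Person("Georgios", a1)

print(p1 == p2)              # True: compared field by field
print(hash(p1) == hash(p2))  # True: hash is derived from the fields

try:
    p1.name = "Someone else"
except FrozenInstanceError:
    print("Person instances are immutable")
```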
Pattern matching is if..else on steroids
// Code for demo only, won't compile
value match {
  // Match on a value, like if
  case 1 => "One"
  // Match on the contents of a list
  case x :: xs => "The remaining contents are " + xs
  // Match on a case class, extract values
  case Email(addr, title, _) => s"New email: $title..."
  // Match on the type
  case xs : List[_] => "This is a list"
  // With a pattern guard
  case xs : List[Int] if xs.head == 5 => "This is a list of integers"
  case _ => "This is the default case"
}
This is by no means a complete introduction to either programming language. Please read more here
Pick one and become good at it!
This work is (c) 2017 - onwards by TU Delft and Georgios Gousios and licensed under the Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International license.