The de facto languages of Big Data and data science are
Other languages include
In our course, we will be using Scala and Python.
Scala
object Hello extends App {
  println("Hello, world")
  for (i <- 1 to 10) {
    System.out.println("Hello")
  }
}
Blocks are delimited by { and }.
Python
#!/usr/bin/env python3
# range(1, 10) yields 1..9; Scala's "1 to 10" also includes 10
for i in range(1, 10):
    print("Hello, world")
Scala
val a: Int = 5
val b = 5
b = 6 // Compile error: reassignment to val
// Type of foo is inferred
val foo = new ImportantClass(...)
var a = "Foo"
a = "Bar"
a = 4 // Compile error: type mismatch
vals are single-assignment; vars can be assigned multiple times.
Python
a: int = 5
a = "Foo"  # Allowed: type hints are not enforced at runtime
a = ImportantClass(...)
Scala
def max(x: Int, y: Int): Int =
  if (x >= y) x else y
Python
def max(x: int, y: int) -> int:
    if x >= y:
        return x
    else:
        return y
Scala
def bigger(x: Int, y: Int,
           f: (Int, Int) => Boolean) =
  f(x, y)
bigger(1, 2, (x, y) => (x < y))
bigger(1, 2, (x, y) => (x > y))
// Compile error
bigger(1, 2, x => x)
Python
def bigger(x, y, f):
    return f(x, y)
bigger(1, 2, lambda x, y: x > y)
bigger(1, 2, lambda x, y: x < y)
# Runtime error
bigger(1, 2, lambda x: x)
bigger is a higher-order function, i.e. a function whose
behaviour is parametrised by another function. f is a function
parameter. To call a higher-order function, we need to pass a function
with the appropriate argument types. Scala checks this at compile time;
in Python, a mismatch only surfaces at runtime.
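The runtime behaviour described above can be observed directly in Python. The following is a minimal sketch, not part of the original slides: the type hints and the variable names less_than, greater_than and wrong_arity_error are illustrative additions.

```python
from typing import Callable


def bigger(x: int, y: int, f: Callable[[int, int], bool]) -> bool:
    """Apply the comparison function f to x and y."""
    return f(x, y)


less_than = bigger(1, 2, lambda x, y: x < y)
greater_than = bigger(1, 2, lambda x, y: x > y)

# The Callable hint is not enforced at runtime, so a one-argument
# function is rejected only when bigger actually calls it with two.
try:
    bigger(1, 2, lambda x: x)
    wrong_arity_error = None
except TypeError as e:
    wrong_arity_error = e
```

A static checker such as mypy could flag the last call before the program runs, which is roughly the guarantee the Scala compiler gives by default.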
Scala
class Foo(val x: Int,
          var y: Double = 0.0)
// Type of a is inferred
val a = new Foo(1, 4.0)
println(a.x) // x is read-only
println(a.y) // y is read-write
a.y = 10.0
println(a.y) // y is read-write
a.y = "Foo" // Type mismatch, y is Double
val means a read-only attribute. var is read-write.
Python
class Foo():
    def __init__(self, x, y):
        self.x = x
        self.y = y
a = Foo(3, 2)
print(a.x)
a.x = "foo"
print(a.x)
Scala
class Foo(val x: Int,
          var y: Double = 0.0)
class Bar(x: Int, y: Int, z: Int)
  extends Foo(x, y)
trait Printable {
  val s: String
  def asString(): String
}
class Baz(x: Int, y: Double, private val z: Int)
  extends Foo(x, y)
  with Printable {
  // Printable's abstract members must be implemented
  val s: String = "Baz"
  def asString(): String = s
}
Python
class Foo():
    def __init__(self, x, y):
        self.x = x
        self.y = y
class Bar(Foo):
    def __init__(self, x, y, z):
        Foo.__init__(self, x, y)
        self.z = z
In both cases, the traditional rules of method overriding apply. Traits in Scala are similar to Java interfaces with default methods (Java 8 and later); in addition, they can include attributes (state).
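To illustrate the overriding rules mentioned above, here is a small Python sketch. The describe method is a hypothetical addition for illustration; it does not appear in the original classes.

```python
class Foo:
    def __init__(self, x, y):
        self.x = x
        self.y = y

    def describe(self):
        return f"Foo({self.x}, {self.y})"


class Bar(Foo):
    def __init__(self, x, y, z):
        super().__init__(x, y)
        self.z = z

    # Overrides Foo.describe; super() still reaches the parent version.
    def describe(self):
        return f"Bar({self.x}, {self.y}, {self.z}), based on {super().describe()}"


# The method that runs depends on the object's actual class.
items = [Foo(1, 2), Bar(1, 2, 3)]
descriptions = [item.describe() for item in items]
```

Calling describe on each element dispatches to the most specific implementation, so the Bar instance uses its own version even though both objects are handled as generic items of the list.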
Scala
case class Address(street: String,
                   number: Int)
case class Person(name: String,
                  address: Address)
val p = new Person("G",
                   new Address("a", 2))
Python >= 3.7
from dataclasses import dataclass
@dataclass
class Address:
    street: str
    number: int
@dataclass
class Person:
    name: str
    addr: Address
p = Person("G", Address("a", 2))
Data classes are blueprints for immutable objects (in Python, only when
declared with frozen=True). We use them to represent data records. Both
languages implement equals (or __eq__) for them, so we can compare
objects directly.
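The equality and immutability properties can be checked with a short Python sketch. It assumes the frozen=True option, which the original example does not use, and the names p1, p2, same_value, same_object and mutation_error are illustrative.

```python
from dataclasses import dataclass, FrozenInstanceError


@dataclass(frozen=True)
class Address:
    street: str
    number: int


@dataclass(frozen=True)
class Person:
    name: str
    addr: Address


p1 = Person("G", Address("a", 2))
p2 = Person("G", Address("a", 2))

same_value = (p1 == p2)    # generated __eq__ compares field by field
same_object = (p1 is p2)   # two distinct objects in memory

# With frozen=True, assigning to a field raises FrozenInstanceError.
try:
    p1.name = "H"
    mutation_error = None
except FrozenInstanceError as e:
    mutation_error = e
```

Without frozen=True the assignment would succeed, so plain Python dataclasses are closer to Scala classes with var fields than to case classes.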
Pattern matching is if..else on steroids
// Code for demo only, won't compile
value match {
  // Match on a value, like if
  case 1 => "One"
  // Match on the contents of a list
  case x :: xs => "The remaining contents are " + xs
  // Match on a case class, extract values
  case Email(addr, title, _) => s"New email: $title..."
  // Match on the type
  case xs : List[_] => "This is a list"
  // With a pattern guard
  case xs : List[Int] if xs.head == 5 => "This is a list of integers"
  case _ => "This is the default case"
}
This is by no means an introduction to either programming language. Please read more here
This work is
(c) 2017, 2018, 2019, 2020 - onwards by TU Delft and Georgios Gousios
and licensed under the Creative
Commons Attribution-NonCommercial-ShareAlike 4.0 International
license.