{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Assignment on Programming Techniques for Big Data\n", "\n", "Functional programming is the basis of most modern Big Data processing systems.\n", "Before going forward to the course, it is important to master data processing\n", "techniques using a functional programming style. In this assignment, your task \n", "is to train yourselves in thinking in a functional way when processing data. \n", "\n", "Many of the the tasks below are easy, but some are not and require many\n", "iterations (and extensive testing!) to get right. For some of them, you\n", "can find ready-made solutions on the internet. Even though it may be tempting,\n", "I advise you against copying and pasting them here, as you will regret it\n", "later on in the course.\n", "[Wax on, wax off!](https://www.youtube.com/watch?v=fULNUr0rvEc)\n", "\n", "This assignment has a total of *110* points. \n", "\n", "**For each function you implement, you also need to define at least one working example.**" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## `fold`s and friends\n", "\n", "**T**(5+5pts): Implement `reduceL` and `reduceR` using iteration for lists/arrays" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true }, "outputs": [], "source": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**T** (10pts): Implement `reduceL` by reusing `reduceR`" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true }, "outputs": [], "source": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**T** (10pts): Implement a function `flatten(xs: [A]): [A]` that converts a list of \n", "lists into a list formed by the elements of these lists. For example:\n", "\n", "```python\n", ">>> a = [[1,2],[2,3],[3,[4]]]\n", ">>> flatten(a)\n", "[1,2,2,3,3,[4]]\n", "```" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true }, "outputs": [], "source": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**T** (10pts): You may have noticed that the original `flatten` definition is not\n", "recursive, as it will only flatten 1 level nested lists.\n", "Write a function `deep_flatten(xs: [A]): [A]` that converts a list of \n", "lists into a list formed by the elements of these lists _recursively_. \n", "For example:\n", "\n", "```python\n", ">>> a = [[1,2],[2,3],[3,[4]]]\n", ">>> flatten(a)\n", "[1,2,2,3,3,4]\n", "```" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true }, "outputs": [], "source": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**T** (10pts): Implement `group_by` by reusing `reduceL`." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true }, "outputs": [], "source": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Simple data processing\n", "\n", "From this point forward, use the `reduceL`, `reduceR` or `flatten` \n", "[implementations you created above](https://en.wikipedia.org/wiki/Eating_your_own_dog_food). \n", "\n", "\n", "**T** (5pts): Implement `distinct` using `reduceL`. Write its function signature." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true }, "outputs": [], "source": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**T** (10pts): Implement a function `reverse` that reverses a list/array. As an \n", "example:\n", "\n", "```\n", ">>> a = [1,2,3,4]\n", ">>> reverse(a)\n", "[4,3,2,1]\n", "```\n", "Also, write down the function signature." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true }, "outputs": [], "source": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**T** (5pts): Implement a function `mean` that calculates the mean of an list \n", "of integers.\n", "\n", "Also, write down the function signature." ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true }, "outputs": [], "source": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Higher-order functions\n", "\n", "\n", "**T** (10pts): Implement a function called `drop_while(xs: [A], f: A -> Boolean) : [A]` \n", "that drops the longest prefix of elements from `xs` that satisfy `f`.\n", "\n", "```\n", ">>> a = [1,2,3,4,5,6]\n", ">>> dropWhile(a, lambda x: x <= 3)\n", "[4,5,6]\n", "```" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true }, "outputs": [], "source": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**T** (10pts): Implement a function zip(xs: [A], ys: [B]): List[(A,B)] that returns a list formed from this list and another list by combining the corresponding elements in pairs. If one of the two lists is longer \n", "than the other, its remaining elements are ignored.\n", "\n", "```\n", ">>> a = [2,3,4]\n", ">>> b = [a,b,c,d]\n", ">>> zip(a,b)\n", "[[2,a],[3,b],[4,c]]\n", "```" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true }, "outputs": [], "source": [] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**T** (10pts): Implement a function \n", "`scanL(xs: [A], init: B, f: (acc: B, x: A) -> B) -> [B]`\n", "that works like `foldL` but instead of producing one final result, it also\n", "returns all the intermediate results.\n", "\n", "```\n", ">>> a = [2,3,4]\n", ">>> scanL(a, 0, lambda x, y: x + y)\n", "[0, 2, 5, 9]\n", "```" ] }, { "cell_type": "code", "execution_count": null, "metadata": { "collapsed": true }, "outputs": [], "source": [] }, { "cell_type": "markdown", "metadata": { "collapsed": true }, "source": [ "\n", "### Credits\n", "\n", "Many of the textual descriptions of the functions where addapted from\n", "the Scala language documentation, [see here](https://www.scala-lang.org/files/archive/api/2.12.2/)." ] } ], "metadata": { "kernelspec": { "display_name": "Python 3", "language": "python", "name": "python3" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.6.1" } }, "nbformat": 4, "nbformat_minor": 2 }