pyk deep learning & natural language processing

printf, safely.

The printf prototype is declared as follows:

int printf(const char *format, ...);

The first argument, const char *format, is a format string that may contain placeholders (conversion specifications) introduced by the % character.

By default, the C compiler doesn’t check whether you use printf correctly. The following unsafe code compiles successfully without any warning or error:

#include <stdio.h>

int
main()
{
    char *a = "string";
    printf(a);
}
# gcc version 4.9.2
gcc printf_unsafe.c
./a.out
string

The code above looks safe, but its behavior is undefined if a contains a placeholder for which no argument was passed. In practice, printf may print whatever value happens to sit in a register or on the stack, which can leak private data from memory.

#include <stdio.h>

int
main()
{
    char *a = "string %d";
    printf(a);
}
# gcc version 4.9.2
gcc printf_unsafe.c
./a.out
string 927983944

So the correct way to use printf is to always pass an explicit format string literal:

#include <stdio.h>

int
main()
{
    char *a = "string %d";
    printf("%s", a);
}
# gcc version 4.9.2
gcc printf_safe.c
./a.out
string %d

Compiling with the -Wformat=2 -Werror flags turns this mistake into a compile-time error, so the unsafe call never makes it into a running program.

# gcc version 4.9.2
gcc printf_unsafe.c -Wformat=2 -Werror
printf_unsafe.c: In function ‘main’:
printf_unsafe.c:7:2: error: format not a string literal and no format arguments [-Werror=format-security]
  printf(a);
  ^
cc1: all warnings being treated as errors

A Note about Scala Anonymous Functions

Scala is my first functional programming language. Recently I have written a lot of Scala code to build a data processing/analytics and machine learning application using Apache Spark.

Looking at source code like this:

val lineWithSaleStock = textFile.filter(line => line.contains("Sale Stock"))

My first impression was: “Where the hell is line coming from?” Nothing before that statement defines a variable or value named line.

I decided to look at the Spark API and found the following filter method definition:

def filter(f: (T) ⇒ Boolean): RDD[T]
    Return a new RDD containing only the elements that satisfy a predicate.

It turns out that the filter method takes a function that returns a Boolean as its argument. So this part must be such a function:

line => line.contains("Sale Stock")

Ahh, interesting: line is the input parameter, and the function returns the Boolean value of line.contains("Sale Stock"). That part is a Scala anonymous function.

Scala has a nice syntax for expressing anonymous functions. I can also bind the previous function to a name and pass it to filter:

val isContainsSaleStock = (s: String) => s.contains("Sale Stock")
val lineWithSaleStock = textFile.filter(isContainsSaleStock)

This post from the Scala documentation is a good resource on Scala anonymous functions.

Spending time and energy for the right thing

Balancing work, school and life is very hard, but it is possible.

For me, balance is about spending my time and energy on the right thing. To do that, I need to know which tasks need to be done tomorrow.

In practice, time and energy are limited resources. Doing everything at once is a bad idea. Splitting a task into smaller sub-tasks gives me a sense of accomplishment, and allocating time for breaks keeps my energy stable.

I don’t have work, school and life balance yet, but I feel I am on the right track.

Scala on Debian

I am digging deeper into the Scala world, so I am leaving a note here on how to set up Scala on Debian.

Choose a Scala version here; I use Scala 2.12.0-M3. Grab the download link for the Debian package:

http://downloads.lightbend.com/scala/2.12.0-M3/scala-2.12.0-M3.deb

Now install it using the following commands:

wget http://downloads.lightbend.com/scala/2.12.0-M3/scala-2.12.0-M3.deb
sudo dpkg -i scala-2.12.0-M3.deb

Try it to make sure it works:

% scala
Welcome to Scala 2.12.0-M3 (Java HotSpot(TM) 64-Bit Server VM, Java 1.8.0_72).
Type in expressions for evaluation. Or try :help.

scala> 2 + 2
res0: Int = 4

Debugging in Rust

The Rust programming language comes with a trait called Debug, defined in the std::fmt module. We can use this trait to display debug information for our struct.

Usage is straightforward: we derive a Debug implementation by placing #[derive(Debug)] above the struct. For example:

#[derive(Debug)]
struct Node {
    index: i32,
    data: &'static str,
}

Then you can use the {:?} format specifier to request the Debug implementation of struct Node.

let n1 = Node{index: 1, data: "Node 1"};
println!("Debug: {:?}", n1);
// Debug: Node { index: 1, data: "Node 1" }

Or we can use the {:#?} format specifier to pretty-print the debug information:

println!("Pretty: {:#?}", n1); 
// Pretty: Node {
//     index: 1,
//     data: "Node 1",
// }

We can also implement the Debug trait ourselves. For example, instead of the Node { index: ..., data: ... } format, let’s use a Node{index, data} format.

The implementation is shown below:

use std::fmt;

struct Node {
    index: i32,
    data: &'static str,
}

impl fmt::Debug for Node {
    fn fmt(&self, f: &mut fmt::Formatter) -> fmt::Result {
        // The space after the comma matches the Node{index, data} format.
        write!(f, "Node{{{}, {:?}}}", self.index, self.data)
    }
}

Then we get our custom debug output:

let n1 = Node{index: 1, data: "Node 1"};
println!("Debug: {:?}", n1);
// Debug: Node{1, "Node 1"}