So Tom, Dick & Harry showed up for an interview at the new grocery delivery startup.

There was a whiteboard, a laptop & a notebook, so they could use whatever made them comfortable.

-------

Tom's interview: The chief data scientist John Doe (JD) walked in.

JD. Let's talk about groceries.

Tom. Ok

JD. So you walk into a grocery store with a grocery bag and some cash, to buy groceries for a week.

Now, a couple problems -

1. Your bag can only hold ten pounds.

2. You only have $100

3. You need about 2000 calories a day, so a weekly shopping trip is about 14,000 calories.

So how do you model this grocery problem and maybe find some groceries to take home ?

Tom. So this seems very open ended. Can you give me specific examples ?

JD. Ok, so a pound of ham has 650 calories. Similarly,

--- Calories Per Pound ---

Ham, 650 cals,

Lettuce, 70 cals

Cheese, 1670 cals

Tuna, 830 cals

Bread, 1300 cals

----

Also, a pound of ham costs $4. Similarly,

---- Price Per Pound ----

Ham, $4

Lettuce, $1.5

Cheese, $5

Tuna, $20

Bread, $1.20

----

Tom thought for a while.

Then he grabbed the laptop, opened up his favorite editor & said: First I will write some functions.
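Tom's functions are not reproduced here. A minimal sketch of the kind of brute-force for-comprehension he mentions later, assuming each item is bought in 0.2 lb steps with at least one step per item, might look like the following. The names GrocerySketch and Basket and the exact filters are assumptions made for illustration, so the number of baskets it finds need not match Tom's.

```scala
// Illustrative sketch only: names and filters are assumptions, not Tom's actual code.
object GrocerySketch {
  // Quantities are counted in 0.2 lb steps so all arithmetic stays in exact integers.
  val MaxSteps    = 50     // 10 lb bag / 0.2 lb per step
  val MaxCents    = 10000  // $100 budget, in cents
  val MinCalories = 14000  // 2000 calories a day for 7 days

  // Per-step figures are JD's per-pound numbers multiplied by 0.2.
  final case class Basket(ham: Int, lettuce: Int, cheese: Int, tuna: Int, bread: Int) {
    def steps: Int       = ham + lettuce + cheese + tuna + bread
    def weightLb: Double = steps / 5.0
    def cents: Int       = 80 * ham + 30 * lettuce + 100 * cheese + 400 * tuna + 24 * bread
    def calories: Int    = 130 * ham + 14 * lettuce + 334 * cheese + 166 * tuna + 260 * bread
  }

  // Brute force: try every combination that fits in the bag,
  // then keep the ones that are affordable and have enough calories.
  lazy val solutions: Vector[Basket] = (for {
    h <- 1 to MaxSteps
    l <- 1 to (MaxSteps - h)
    c <- 1 to (MaxSteps - h - l)
    t <- 1 to (MaxSteps - h - l - c)
    b <- 1 to (MaxSteps - h - l - c - t)
    basket = Basket(h, l, c, t, b)
    if basket.cents <= MaxCents && basket.calories >= MinCalories
  } yield basket).toVector
}
```

With this in scope, something like GrocerySketch.solutions.maxBy(_.cents) would pick out the most expensive basket under these assumed filters.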

JD: Very good! So how many solutions could you find ?

Tom:

scala> solution.size

res14: Int = 299

JD: So of the 299 solutions, which is the most expensive way to shop ?

Tom: That's easy!

scala> val mostExpensive = solution.maxBy{ case (ham,lettuce,cheese,tuna,bread, weight, cost, calories) => cost }

scala> println(mkStr(mostExpensive))

Ham 0.20 Lettuce 0.20 Cheese 7.20 Tuna 2.00 Bread 0.20 Weight 9.80 Cost 77.34 Calories 14088.00

JD: Great! So you end up spending $77

Can you tell me what's the least fatty way to eat ?

Tom: Hmm.. I guess I would cut down on the cheese, tuna & ham.

scala> val leastFatty = solution.minBy{ case (ham,lettuce,cheese,tuna,bread, weight, cost, calories) => cheese+tuna+ham }

scala> println(mkStr(leastFatty))

Ham 0.20 Lettuce 0.20 Cheese 4.80 Tuna 0.20 Bread 4.40 Weight 9.80 Cost 34.38 Calories 14046.00

So that gives you 4.4 pounds of bread with about the same amount of cheese & only costs about $34. Interesting!

Ok, so Tom.

Here's what I like about you.

You came in, you had absolutely no idea what I was going to ask.

I posed a problem.

You wrote a little bit of code & got actual results.

Your results were correct & interesting & you didn't fumble around with syntax or anything.

In about 20 minutes with some 20 lines of code, you actually have something interesting!

Tom: Thanks! I code every single day, so coding is never a problem.

This is just a for-comprehension, so it wasn't too hard.

JD: You code every single day ?

Tom: Yes. Always Be Coding.

JD: When do you get time to learn ? You know, math, machine learning, statistics, these things take a large block of free time to learn. If you are always coding, when do you learn new material ?

Tom: Uhhh... well, I am a self-taught developer. I learn by writing code. If I need to learn something, I will write some code. That's how I learn.

JD: So you can't pick up a math textbook & solve a few problems with paper & pen...

Tom: What is this, high school ? No, I generally write code.

JD: Sometimes we need to research. Read CS papers, mine the literature. Most of that is just reading through the math, working out proofs.

Tom: That's not a right fit for me personally. I like to code. If you give me a well-specified algorithm, I can code it up & make it efficient.

JD: Hmmm...ok. Say instead of ham, lettuce, bread, cheese, tuna, you know, instead of 5 items, you had a real grocery shop. Very conservatively, 1000 items.

Tom: ok

JD: OK ? How would you proceed in that case ?

Tom: What's wrong with my for-comp ?

JD: Well, you are brute-forcing the solution, that works well for 5 items but not 1000.

Tom: So it will just take a long time.

JD: That isn't a problem ?

Tom: No, we'll like, use map-reduce or something. Parallelize it.

JD: Are you seriously suggesting you can solve the brute-force version of this problem for 1000 items in some reasonable amount of time ?

Tom: You can't ? Why not ? Processors are quite fast.

JD: I don't think you quite understand. When you pick the ham, going from 0.2 lb to 10 lb, 0.2 lb at a time, how many choices do you make ?

Tom: Uhhh...

scala> (10-0.2)/0.2

res16: Double = 49.0

JD: So including the initial 0.2, 50 choices.

Tom: OK, so ?

JD: So with 50 choices per item, for 5 items, how many choices do you have ?

Tom: Hmmm...

scala> printf("%.2f", math.pow(50,5))

312500000.00

JD: See ? 300 million!

Tom: I see. So with 1000 items I'd have like...

scala> printf("%.2f", math.pow(50,1000))

Infinity

JD: Ha ha ha ha ha ha ha ha!

Tom: Oops! Did not see that coming. It actually said Infinity on the REPL!
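The REPL prints Infinity because 50 to the power 1000 is far larger than the biggest finite Double, which is roughly 1.8e308. As a small sketch (the object and value names are made up for illustration), BigInt can show how large the search space really is:

```scala
// Illustrative sketch: exact size of the brute-force search space using BigInt.
object SearchSpace {
  val choicesPerItem: BigInt = BigInt(50)

  // 5 items: still a number a machine can enumerate.
  val fiveItems: BigInt = choicesPerItem.pow(5)

  // 1000 items: too large for a Double, but BigInt holds it exactly.
  val thousandItems: BigInt = choicesPerItem.pow(1000)

  // Number of decimal digits in the 1000-item count.
  val digits: Int = thousandItems.toString.length
}
```

Here SearchSpace.fiveItems is 312500000, while SearchSpace.digits comes out at 1699, so the 1000-item count is a number with about 1700 digits. No amount of parallelism gets through that many combinations.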

JD: Ha ha ha! Yeah. So what will you do in this case ? Ha ha ha!

Tom: I don't know, man. There's probably some fancy math to solve it. I see where you are going. But it's not my cup of tea. I ship code. I'm not an R&D guy. I don't read papers or do math. Thank you.

JD: Thank you Tom. We'll ...uhh... get back to you. Ha ha ha ha!

Tom: Yeah whatever...

------

Dick's interview:

The chief data scientist John Doe (JD) walked in.

JD. Let's talk about groceries.

Dick. Why ?

JD. Uhhh...that's like what we do here. We are a grocery startup. So you walk into a grocery store with a grocery bag and some cash, to buy groceries for a week.

Now, a couple problems -

1. Your bag can only hold ten pounds.

2. You only have $100

3. You need about 2000 calories a day, so a weekly shopping trip is about 14,000 calories.

So how do you model this grocery problem and maybe find some groceries to take home ?

Dick: I don't generally engage in algorithm pissing contests.

JD: I don't understand. These are the sort of problems we work on in this company...

Dick: Yeah, but let's talk about the real deal. Why am I here ?

JD: Real deal ?

Dick: You know, Hadoop. Data volume. Cluster size. Number of mappers & reducers. Setting up the data pipeline. You know, Thrift vs Protobufs. Tooling. Telemetry. Build systems. The dev process here. How do you guys write unit tests ? What's your logging system ? How do you do A/B tests ? Building Analytics Dashboards. Shipping. Code Reviews. You know, let's talk real life. Not some stupid grocery algorithm.

JD: Uh...Uhhh...You don't believe writing a little bit of code is a good idea ? Even pseudocode will do.

Dick: Dude. Obviously there is some magic function that will spit out the right amount of groceries or whatever. It isn't something I can come up with in the span of this interview. It will take several man-months of coding and lots of unit tests to perfect.

JD: No, no, I assure you we can solve a toy version in this interview.

Dick: I don't play with toys. I work on distributed systems in 100+ node clusters. I'm a Cloudera certified data scientist. I've given keynotes at 3 different Big Data conferences...

JD: Ok Dick, we'll keep in touch. Thank you.

Dick: ???!!

-----------------

Harry's interview.

The chief data scientist John Doe (JD) walked in.

JD. Let's talk about groceries.

Harry. Suuuuuuure.. I thought this was data-analytics work.

JD. Yes, it is. We are a big data grocery startup. So you walk into a grocery store with a grocery bag and some cash, to buy groceries for a week.

Now, a couple problems -

1. Your bag can only hold ten pounds.

2. You only have $100

3. You need about 2000 calories a day, so a weekly shopping trip is about 14,000 calories.

So how do you model this grocery problem and maybe find some groceries to take home ?

Harry: (stares blankly at the wall)

JD: Do you want me to repeat the problem ?

Harry: I would like some data so I can get an idea...

JD: Sure!

JD then wrote the caloric & pricing data below on the whiteboard.

--- Calories Per Pound ---

Ham, 650 cals,

Lettuce, 70 cals

Cheese, 1670 cals

Tuna, 830 cals

Bread, 1300 cals

----

---- Price Per Pound ----

Ham, $4

Lettuce, $1.5

Cheese, $5

Tuna, $20

Bread, $1.20

----

Harry thought long & hard & stared at the board.

He picked up the notebook & took out a pencil from behind his ear and wrote some equations.

Finally, he showed JD his work -

X = (h l c t b ...) = 1xn vector

A = [ (1,650,4), (1,70,1.5), (1,1670,5),...] = nx3 matrix

B = (10 14000 100) = 1x3 vector

Solve XA = B

Now perturb A & B so that you get a solution vector at every single price point and weight.

JD: What's n ?

Harry: Obviously I'm assuming there will be you know, 1000s of items in the real-life grocery store. n is the number of grocery items.

JD: Ok, and why do I need to perturb A or B ?

Harry: Well, grocery prices will change every day. Sometimes even on the same day I've noticed people charge you more during the peak shopping hours. So the price vector in A changes.

JD: What about B ?

Harry: Well, this matrix setup assumes an exact solution, i.e. the weights add up to exactly 10 pounds and the prices add up to exactly $100. In reality, the prices can add up to anything below $100 and the weights to anything below 10 pounds. So you are going to have a whole range of B's.

JD: Hmmm....

Harry: You can also have different B's for different shoppers. Some shoppers might have a 5 pound shopping bag and eat 3000 calories per day & might have $200 to spend on groceries. So (5 21000 200), for example.

JD: Very Nice!

Harry: You can personalize the B matrix on a per-shopper basis, perturb A to reflect the latest prices every hour or so, and run the solver pretty much nonstop 24x7.
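Harry's setup can be made concrete for the five items on the whiteboard. The sketch below is only an illustration under stated assumptions: the names MatrixSketch, totals and feasible are made up, and it assumes weight and price are upper limits while calories are a lower limit. It checks whether a given basket X fits a shopper's B; it is not a solver.

```scala
// Illustrative sketch: check one basket against one shopper's limits.
// Names and the direction of each inequality are assumptions.
object MatrixSketch {
  // One row per item: (weight in lb, calories, price in $) for one pound of it.
  val a: Vector[Vector[Double]] = Vector(
    Vector(1.0, 650.0, 4.0),   // ham
    Vector(1.0, 70.0, 1.5),    // lettuce
    Vector(1.0, 1670.0, 5.0),  // cheese
    Vector(1.0, 830.0, 20.0),  // tuna
    Vector(1.0, 1300.0, 1.2)   // bread
  )

  // Row vector times matrix: totals(j) = sum over i of x(i) * a(i)(j).
  def totals(x: Vector[Double]): Vector[Double] =
    a.transpose.map(col => col.zip(x).map { case (aij, xi) => aij * xi }.sum)

  // b = (max weight in lb, min calories, max price in $) for one shopper.
  def feasible(x: Vector[Double], b: Vector[Double]): Boolean = {
    val t = totals(x)
    t(0) <= b(0) && t(1) >= b(1) && t(2) <= b(2)
  }
}
```

For example, the basket Vector(0.2, 0.2, 7.2, 2.0, 0.2) passes feasible against Vector(10.0, 14000.0, 100.0), but fails against the second shopper's Vector(5.0, 21000.0, 200.0) because it weighs more than 5 pounds.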

JD: What solver ?

Harry: We'd need to buy/build an industrial strength QP solver. You know, a CPLEX or a GAMS. Or we could implement one in-house using libraries like NAG. Of course, for some trivial cases with say just 10 grocery items, I can code up something in Matlab.

JD: Wow! Sounds like you've done a lot of programming in your career on QPs.

Harry: Not really. I don't like to program.

JD: What ?

Harry: I mostly read. I read a lot of ML literature. I have several papers in various stages of preparation.

JD: So you never code ?

Harry: Well, I write some R once in a while. To prove some results. Sometimes I use Matlab to speed things up. I don't, like, write software, or ship or use github or whatever.

JD: So you don't know git ?

Harry: I know the name.

JD: Ha ha!

Harry: Well, I generally don't use a computer. I'm mostly a paper-pencil guy. Sometimes one needs to write code, and I get that. But it's not my favorite part of the job.

JD: You understand that this is a programming role, though we do a lot of data science.

Harry: Yes, but data scientists aren't necessarily programmers. I don't particularly like to write code.

JD: Hmmm...

------------

So after talking with the CEO, the company made a very generous offer to Harry.

The company also asked Tom if he'd consider interning for a few months.

Negotiations are in progress...