Now, ad execs aren't imbeciles. With an average 8-figure net worth, they are very market savvy & can sell ice to the eskimo. Yet, posing blue collar math questions & expecting answers also rooted in blue collar math is one of their key failings.

**The SERP ads business**

It's not exactly their fault - the SERP ads business is very much blue collar math.

You search for a keyword on a Google or a Bing. Say you type in "red shoes".

Say Macy's bids on a red shoes ad, along with Nordstrom and Nike and Adidas and all the other shoe guys.

Say Macy's bid wins the ad auction.

Voila! Macy's red shoes ad shows up on your search results page.

For this privilege, Macy's pays a cpc, i.e. a cost per click.

So if the cpc = 10 cents, and there were 20 clicks, Macys would have a cost of 200 cents i.e. $2.

Basic multiplication.

Blue collar math.

If 3 of those 20 clicks resulted in someone purchasing that red shoe, that's a conversion rate of 3/20 = 0.15, i.e. 15%.

Grade school division.

Blue collar math.

Suppose each purchase results in two dollars in revenue, that's $6 in revenue right there.

All very blue collar, yeah ? So we've established that execs grasp blue collar math very well.
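The same arithmetic as a minimal Scala sketch, using only the numbers quoted above (the variable names are mine, purely for illustration):

```scala
// Worked numbers from the red shoes example.
val cpcCents = 10               // cost per click, in cents
val clicks = 20
val purchases = 3
val revenuePerPurchase = 2.0    // dollars

val costDollars = cpcCents * clicks / 100.0        // 200 cents, i.e. $2
val conversionRate = purchases.toDouble / clicks   // 3 / 20
val revenueDollars = purchases * revenuePerPurchase

println(f"cost = $costDollars%.2f dollars, conversion rate = $conversionRate%.2f, revenue = $revenueDollars%.2f dollars")
```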

**The Problem**

However, put a bunch of these search keywords in a bag - your red shoes, your blue skirts, pink ties and grey coats and red blouses and satin pantyhose and a million other items of apparel the modern American consumer can seldom do without. Call this bag a portfolio & stick a dollar amount on it.

Say, $10000.

So that there is your daily budget, to spend on these search keywords in the portfolio.

How best to bid on each keyword in order to maximize the portfolio revenue whilst constrained by budget ?

That's a blue collar problem, but not one with a blue collar solution.

**The Solution**

When I solve this problem, I use an LSTM to model the conversion rate, an LMA fitter to model the highly nonlinear bid-cpc relationship, a Brent solver to optimize the portfolio by computing Lagrange multipliers, and a few nitty gritties I'd rather not go into 'cause the competition in the adtech space - it's fierce.

So when the ad execs say "Can you quickly walk us through how exactly you compute the optimal bid for each keyword" - see, this quick walk - it's not exactly a walk in the park, yeah ? 'Cause it's blue collar math no more. Legit white collar territory.

**The Opinion**

Now I could indulge them with some intuition, but no, they make it amply clear - they don't want to be indulged.

They want "the exact math".

The "exact nature of the computation"

How "exactly" do I compute the optimal bid ?

Surely, they opine - it can't be so hard. SERP ads are blue collar territory. "There's got to be a simple solution" !

**Work required to have an opinion**

Now, having opinions is easy. We all have them.

So these execs think the underlying math for optimized portfolio bidding must be darned simple.

That's their opinion.

Let's run with that opinion.

I bravely mount a frontal assault on the problem, explaining each piece of the puzzle as best I can. Soon I realize I have no chance. These are folk who don't know what a Jacobian is. Don't know gradient descent. Don't know backpropagation. Don't know time series. Hell, don't know what a non-convex function is. Don't even know what slope means.

Further, they are simply unwilling to do the work required!

I have no issues if one doesn't understand something. We were all there once. Every one of us.

However, if one is simply unwilling to do the work required, and yet demands understanding on a platter, that's a huge problem.

Charlie Munger often talks about the work required to have an opinion.

Entertaining an opinion without doing the work required to justify the opinion is akin to letting a stranger occupy your apartment rent-free.

Say you had some free space, and you'd like to rent it out.

You find prospective tenants, call up references, look up employment history, finally pick somebody reliable to live in that space.

Why don't you just walk outside and ask a random stranger to park his ass rent-free in your apartment for as long as he wants ?

Because space is dear.

**Now if you think the real estate in your apartment is so expensive it justifies all this work involved in finding the right tenant, surely you'd agree the real estate in your precious brain is 100x more expensive! Why would you park any random opinion in your brain without doing the work upfront that justifies the opinion's presence ?**

Yet, one happily harbors all sorts of incorrect notions on every topic under the sun without being willing to do the work required.

**Avoiding stupidity is easier than seeking brilliance**

I realize these execs are seeking a dramatic flash of insight into a complicated problem - one beyond their current levels of intellect.

They want an aha! moment. Something that tells them - "I get it!"

However, that demands 3-4 semesters of undergrad math, not something compressible into a half hour.

Well, Munger also said it's far simpler to avoid stupidity than seek brilliance.

So I steer them towards a non-brilliant problem.

So they don't have to be brilliant - let's see if they can just avoid being stupid.

**The simpler problem**

I address the execs - Let's assume we're solving this problem way back in the 1990s, yeah ? So, we don't use neural nets, no time series, very straightforward blue collar math.

__Solvable by hand, no computer required__!

Say you have a portfolio of 5 keywords. Not the typical 10,000 keywords. Just 5.

5 lousy keywords i.e. { red shoe, blue pant, green skirt, pink tie, satin pantyhose }, sitting in a portfolio, with a $1000 daily budget.

So,

n = 5, daily budget = $1000

Let

cpc = sqrt(bid)

clicks = sqrt(cpc)

conv = 0.01 * i * clicks, for i = 1 to 5

i.e. the first keyword has a 1% conversion rate, the second 2%, the third 3%, the fourth 4% and the fifth has a 5% conversion rate.

rev = i * 100 * conv, for i = 1 to 5

i.e. the first keyword leads to $100 in revenue per conversion, the second $200, the third $300 etc.
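If it helps to see the toy model as code, here is a minimal Scala sketch of the four curves above. The function names are my own, chosen for illustration, and the flat $1 bid at the end is just a sanity check, not part of the challenge.

```scala
// Toy model from the text; names are illustrative, not from any library.
def cpcOf(bid: Double): Double = math.sqrt(bid)      // cpc = sqrt(bid)
def clicksOf(cpc: Double): Double = math.sqrt(cpc)   // clicks = sqrt(cpc)
def convOf(i: Int, clicks: Double): Double = 0.01 * i * clicks
def revOf(i: Int, conv: Double): Double = i * 100 * conv

// Cost and revenue of keyword i (1-based) at a given bid.
def keywordCost(bid: Double): Double = cpcOf(bid) * clicksOf(cpcOf(bid))
def keywordRevenue(i: Int, bid: Double): Double =
  revOf(i, convOf(i, clicksOf(cpcOf(bid))))

// Portfolio totals for a sequence of 5 bids.
def totalCost(bids: Seq[Double]): Double = bids.map(keywordCost).sum
def totalRevenue(bids: Seq[Double]): Double =
  bids.zipWithIndex.map { case (b, idx) => keywordRevenue(idx + 1, b) }.sum

// Sanity check: bid $1 on every keyword.
val flat = Seq.fill(5)(1.0)
println(s"cost = ${totalCost(flat)}, revenue = ${totalRevenue(flat)}")
```

The challenge is then to pick the 5 bids that maximize `totalRevenue` while keeping `totalCost` at or below 1000.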

Now, find the maximum revenue this portfolio generates & I'll personally give you $1000 right now, out of my pocket!

Who doesn't like a challenge ?

So the execs quickly crack open their laptops, and I step out for a coffee.

"Hey, where are you going ? We'll be done in a minute!"

Yeah sure!

I'm back in a half hour.

They are still going at it.

I glance at one of the laptops.

This exec has Microsoft Excel going, with columns for bid, cpc, click etc.

His rows are filled with lots of numbers & the total budget doesn't seem to add up.

Another exec must have been a programmer in his past life.

He has written a series of nested for-loops, 5 levels deep, that scan every bid from $0 to $100 in 5 cent increments!

The program seems to be either buggy or too slow, for he doesn't yet have a solution.
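For a sense of why five nested loops might not finish over a coffee break, a back-of-the-envelope count helps. The sketch below only counts grid points; the one-billion-evaluations-per-second figure is my own optimistic assumption, not a measurement of his program.

```scala
// Size of the brute-force grid: 5 bids, each scanned from $0 to $100 in 5-cent steps.
val stepsPerBid = 100 * 100 / 5 + 1             // 2001 candidate bids per keyword
val combinations = BigInt(stepsPerBid).pow(5)   // one factor per nested loop

// Assumed throughput: one billion evaluations per second (a guess, and a generous one).
val evalsPerSecond = BigInt(1000000000)
val days = combinations / evalsPerSecond / 86400

println(s"$stepsPerBid bids per keyword, $combinations combinations, roughly $days days")
```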

The third exec has given up. His notepad is full of calculations scratched out.

"How do you solve the darned thing ?!", he exclaims.

"Well, you have a kid in college, yeah ?" I ask.

"Sophomore", he says.

"I bet your kid could solve this under 5 minutes."

Now he's really pissed.

"You are telling me my kid can solve this problem, but I can't ?"

Yes!

"How come ?"

"It's just straightforward calculus. It's like, really really basic calc."

"Hey, I know calculus, I went to college too!"

"Yeah, but when was the last time you used calculus ? You don't know what the slope of a function is, how the heck are you going to optimize bidding ?!"

**Solution to the Simpler Problem**

So I solve the problem on the whiteboard.

For the first keyword,

rev1 = 100 * conv1 = 100 * 0.01 * click1 = sqrt(cpc1)

cost1 = cpc1 * click1 = cpc1 * sqrt(cpc1) = cpc1 ^ (3/2)

Keyword #2:

rev2 = 200 * conv2 = 200 * 0.02 * click2 = 4*sqrt(cpc2)

cost2 = cpc2 * click2 = cpc2 * sqrt(cpc2) = cpc2 ^ (3/2)

Keyword #3:

rev3 = 300 * conv3 = 300 * 0.03 * click3 = 9*sqrt(cpc3)

cost3 = cpc3 * click3 = cpc3 * sqrt(cpc3) = cpc3 ^ (3/2)

Keyword #4:

rev4 = 400 * conv4 = 400 * 0.04 * click4 = 16*sqrt(cpc4)

cost4 = cpc4 * click4 = cpc4 * sqrt(cpc4) = cpc4 ^ (3/2)

Keyword #5:

rev5 = 500 * conv5 = 500 * 0.05 * click5 = 25*sqrt(cpc5)

cost5 = cpc5 * click5 = cpc5 * sqrt(cpc5) = cpc5 ^ (3/2)

So total revenue = sqrt(cpc1) + 4*sqrt(cpc2) + 9*sqrt(cpc3) + 16*sqrt(cpc4) + 25*sqrt(cpc5)

total cost = cpc1 ^ (3/2) + cpc2 ^ (3/2) + cpc3 ^ (3/2) + cpc4 ^ (3/2) + cpc5 ^ (3/2)

Your Lagrangian then becomes L = total revenue + lambda * (budget - total cost), where lambda is the Lagrange multiplier.

L = sqrt(cpc1) + 4*sqrt(cpc2) + 9*sqrt(cpc3) + 16*sqrt(cpc4) + 25*sqrt(cpc5) + lambda * (1000 - (cpc1 ^ (3/2) + cpc2 ^ (3/2) + cpc3 ^ (3/2) + cpc4 ^ (3/2) + cpc5 ^ (3/2)))

Find the partial derivatives w.r.t. each variable and set them to zero.

dL/dcpc1 = 1/(2*sqrt(cpc1)) - lambda * 3/2 * sqrt(cpc1) = 0

A little bit of algebra yields

6 * lambda * cpc1 = 2

The other partials are just as simple:

6 * lambda * cpc2 = 8

6 * lambda * cpc3 = 18

6 * lambda * cpc4 = 32

6 * lambda * cpc5 = 50

Stare at this mess until you realize the obvious:

cpc2 = 4 * cpc1

cpc3 = 9 * cpc1

cpc4 = 16 * cpc1

cpc5 = 25 * cpc1

Good!
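If you'd rather not trust the whiteboard algebra, the first-order conditions can be checked numerically. This is a sketch for the toy model only; the value of lambda is arbitrary, since the check holds for any positive multiplier.

```scala
// From 6 * lambda * cpc_i = 2 * i^2 we get cpc_i = i^2 / (3 * lambda).
val lambda = 0.1 // arbitrary positive value, chosen only for this check
val cpcs = (1 to 5).map(i => i * i / (3 * lambda))

// dL/dcpc_i = i^2 / (2 * sqrt(cpc_i)) - lambda * (3/2) * sqrt(cpc_i)
def gradient(i: Int, c: Double): Double =
  i * i / (2 * math.sqrt(c)) - lambda * 1.5 * math.sqrt(c)

val gradients = (1 to 5).map(i => gradient(i, cpcs(i - 1)))
val ratios = cpcs.map(_ / cpcs.head) // should come out close to 1, 4, 9, 16, 25

println(gradients.map(g => f"$g%.1e").mkString(", "))
println(ratios.mkString(", "))
```

Every gradient should print as zero up to floating point noise, and the ratios reproduce the cpc relationships above.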

Since cpc = sqrt(bid) and the square root is a monotonically increasing function, we've also learnt that we should bid much more on a keyword that converts better.

Since keyword 2, i.e. the blue pant, has four times the cpc of the red shoe, we bid 16x for the pant.

Not clear ?

Say cpc1 = sqrt(bid1) and cpc2 = sqrt(bid2).

Then cpc2 = 4*cpc1 = 4*sqrt(bid1) = sqrt(16*bid1) => bid2 = 16*bid1

All well & good. We have a relationship between bids! But how much precisely do we bid ?

Ah, that's easy - because we have one more partial at our disposal.

dL/dlambda = (1000 - (cpc1 ^ (3/2) + cpc2 ^ (3/2) + cpc3 ^ (3/2) + cpc4 ^ (3/2) + cpc5 ^ (3/2))) = 0

Given the cpc relationships, this immediately evaluates to

(cpc1 ^ (3/2)) * (1 ^ (3/2) + 4 ^ (3/2) + 9 ^ (3/2) + 16 ^ (3/2) + 25 ^ (3/2)) = 1000

All done!

Now, if you really want to know how much that works out to in dollars & cents, fish out your REPL calculator:

scala> math.pow(1000/(1 to 5).map{x=>math.pow(x*x,3.0/2.0)}.sum, 2.0/3.0)

res: Double = 2.703200886921511

So cpc1 = $2.70, cpc2 is 4x that, cpc3 9x, & so forth.

You can verify that the costs add up to exactly $1000.

To find the optimal bids, simply square the cpcs, i.e. bid1 = cpc1^2 = $7.3, bid2 = cpc2^2 etc.
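Here is one way to do that verification, as a short Scala sketch that recomputes every cpc, cost and bid from the relationships derived above (the variable names are mine):

```scala
// Recompute the optimal cpcs, costs and bids for the 5-keyword toy portfolio.
val budget = 1000.0
val sumOfCubes = (1 to 5).map(i => math.pow(i, 3)).sum   // 1 + 8 + 27 + 64 + 125
val cpc1 = math.pow(budget / sumOfCubes, 2.0 / 3.0)

val cpcs  = (1 to 5).map(i => i * i * cpc1)    // cpc_i = i^2 * cpc1
val costs = cpcs.map(c => math.pow(c, 1.5))    // cost_i = cpc_i ^ (3/2)
val bids  = cpcs.map(c => c * c)               // bid_i = cpc_i ^ 2

(1 to 5).foreach { i =>
  println(f"keyword $i: cpc = ${cpcs(i - 1)}%.2f, cost = ${costs(i - 1)}%.2f, bid = ${bids(i - 1)}%.2f")
}
println(f"total cost = ${costs.sum}%.2f")
```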

The max. revenue the portfolio generates =

sqrt(cpc1) + 4*sqrt(cpc2) + 9*sqrt(cpc3) + 16*sqrt(cpc4) + 25*sqrt(cpc5) =

sqrt(cpc1) + 4*sqrt(4*cpc1) + 9*sqrt(9*cpc1) + 16*sqrt(16*cpc1) + 25*sqrt(25*cpc1) =

sqrt(cpc1) * ( 1 + 8 + 27 + 64 + 125 )

scala> math.sqrt(2.7) * ( 1 + 8 + 27 + 64 + 125 )

res: Double = 369.7

So, about $370

Now, try as you might, you can't do any better!

You bid more, you blow the budget !

You bid less, your revenue drops !!

So there's your unique solution !!!
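A skeptic can poke at that claim numerically. The sketch below, for the toy model only, spends the same $1000 in thousands of random ways and compares the revenue against the optimal split. It works in terms of spend per keyword: since cost_i = cpc_i^(3/2), revenue_i = i^2 * sqrt(cpc_i) = i^2 * spend_i^(1/3). The seed and the number of trials are arbitrary choices of mine.

```scala
import scala.util.Random

// Revenue as a function of how the budget is split across the 5 keywords.
def revenue(spend: Seq[Double]): Double =
  spend.zipWithIndex.map { case (s, idx) => (idx + 1) * (idx + 1) * math.cbrt(s) }.sum

val budget = 1000.0
// Optimal split from the derivation: spend_i proportional to i^3 (the cubes sum to 225).
val optimal = (1 to 5).map(i => budget * i * i * i / 225.0)
val best = revenue(optimal)

// Try many random ways of spending exactly the same budget.
val rng = new Random(42) // fixed seed, arbitrary
val trials = Seq.fill(10000) {
  val w = Seq.fill(5)(rng.nextDouble())
  revenue(w.map(_ / w.sum * budget))
}

println(f"optimal revenue = $best%.2f, best random split = ${trials.max}%.2f")
```

No random split should beat the optimal one. The optimal figure here is computed from the unrounded cpc1, so it lands a few cents above the $369.7 quoted earlier, which used cpc1 rounded to 2.7.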

**Reaction**

The execs take a minute to digest all this math.

Wow! says one.

OK !! thumps another.

Hold on - says the third.

So you spend $1000 and make only $370 - that's a loss.

Sure. In this artificial problem, with 5 keywords modeled by very simple curves, you lose money.

In reality, there are 1000s of keywords with sophisticated functions fitting them.

So you make a healthy profit.

"So is this what you do ? Is this our optimization model ?"

"Ummm, not quite. There are several problems with this model."

"Problems ? After all this math ?"

**Realization**

Yeah.

The conversion rate per keyword isn't a constant, like this model implies.

It varies. Sometimes, a lot.

The revenue per click varies. A lot.

The cpcs & clicks aren't simple square root functions - they are quite complicated non-convex forms.

So I employ LMA curve fitters & time series & neural nets... but that complicates the math 100x, which is why I won't go into all that now.
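One piece of the idea can still be shown on the toy model without giving anything away: when the curves are too messy for a closed form, you search for the Lagrange multiplier numerically until the spend matches the budget. The sketch below is emphatically not the production method. It uses the square root toy curves, and plain bisection stands in for a Brent solver; `solveLambda` and its search bounds are hypothetical names and values of my own.

```scala
// Toy model only: for a given lambda the first-order conditions give
// cpc_i = i^2 / (3 * lambda), so total cost decreases as lambda grows.
def totalCost(lambda: Double): Double =
  (1 to 5).map(i => math.pow(i * i / (3 * lambda), 1.5)).sum

// Bisection on lambda (a stand-in for a Brent solver) until cost matches the budget.
def solveLambda(budget: Double, lo0: Double = 1e-6, hi0: Double = 1e6): Double = {
  var lo = lo0
  var hi = hi0
  for (_ <- 1 to 200) {
    val mid = math.sqrt(lo * hi)            // geometric midpoint: lambda spans many decades
    if (totalCost(mid) > budget) lo = mid   // spending too much, so raise lambda
    else hi = mid
  }
  math.sqrt(lo * hi)
}

val lambda = solveLambda(1000.0)
val cpc1 = 1 / (3 * lambda)
println(f"lambda = $lambda%.4f, cpc1 = $cpc1%.4f")
```

On the toy problem this recovers the same cpc1 as the closed-form answer, which is the only reason to trust the approach before pointing it at uglier curves.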

So yeah, you may claim to run a blue collar business.

But underneath all that macho blue exterior is some fairly polished white collar math, that makes the whole thing tick.

**For it is a well known adage that blue collar math can pose many a question, which even white collar math struggles to answer.**

Shrug.