Hacker News | joe-stanton's comments

Heads up, your DNS is still broken! Will try emailing you a little later...


Red Badger | London, UK | Full Stack Engineer (Mid & Senior) | Onsite | Full Time

Independent digital consultancy known for delivery and digital transformation working on enterprise scale web applications.

We love: React/Native, JavaScript, Node, Java, Ruby (more tech here: https://red-badger.com/technology)

We are on the hunt for a friendly new badger who enjoys complex problems, talking to clients and working on a tight-knit team.

Cross-functional teams including Delivery and Tech Leads, Engineers, Product & UX Design, and Test.

For more details please visit... https://red-badger.com/jobs?hn

If you have questions, please email Laura Hasting, Community Manager: laura.hasting@red-badger.com

We also organise React London, We Love_Tech and UXD Exchange.


You might be interested in this: https://github.com/gcanti/io-ts and many related projects by Giulio.


Beforehand I built something very similar to what you described for Flow (https://hackernoon.com/runtime-introspection-of-flow-types-d...).

Many other projects have since superseded this implementation, the above being one example.


There's also a very similar project at https://github.com/pelotom/runtypes that our company uses. I don't have enough experience with it yet to say which is better, io-ts or runtypes.


This is a really useful article. It's a shame that so much development time is wasted on large numbers of fruitless optimisations just because they are "easy" (e.g. tweaking the colour of a CTA).

That being said, I'm surprised many of the results are so negative. It would be great to also see the max uplift achieved for each category. A number of retailers I've worked with have been able to beat these uplifts by quite a bit. I wonder if it might be significantly skewed by the kind of clients Qubit has?


Two things - firstly, each of the scores you see in the key findings is just the average. We have also estimated the size of the standard deviation (see table in section 2, or appendix A). So for some treatments, large uplifts are not out of the question.

Maybe more importantly - every A/B test ever run suffers from measurement error, and usually in e-commerce this error is on the scale of the effect you are trying to measure. This means that sometimes you will 'see' massive uplifts, where in actuality most of the size of the effect was due to random noise. This is kind of the curse of e-commerce: most people have enough data to say something (we are 95% sure this test was positive), but most cannot say it with any notable precision (we are 95% sure the uplift was between +8% and +9%). Basically all the stats in this analysis are trying to remove this noise, and this is what we got.
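To make the point about precision concrete, here is a minimal sketch of a normal-approximation 95% interval for the relative uplift in conversion rate. The visitor and conversion counts are made up for illustration, the function name `uplift_interval` is hypothetical, and this is not the method used in the analysis.

```python
import math

def uplift_interval(conv_a, n_a, conv_b, n_b, z=1.96):
    """Return (point, low, high) for the relative uplift of B over A.

    Uses a normal approximation to the difference of two proportions,
    then divides by the control rate. Treating the control rate as exact
    in that division is a simplification.
    """
    p_a = conv_a / n_a
    p_b = conv_b / n_b
    se = math.sqrt(p_a * (1 - p_a) / n_a + p_b * (1 - p_b) / n_b)
    diff = p_b - p_a
    return diff / p_a, (diff - z * se) / p_a, (diff + z * se) / p_a

# Hypothetical test: 20,000 visitors per arm, 3.0% vs 3.3% conversion.
point, low, high = uplift_interval(600, 20000, 660, 20000)
print(f"uplift {point:+.1%}, 95% interval {low:+.1%} to {high:+.1%}")
```

With these invented numbers the point estimate is +10%, but the interval is more than twenty percentage points wide and includes zero, which is the "error on the scale of the effect" situation described above.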


Great to see a multilevel model used to shrink the effects. I was reading the abstract and thought, they probably didn't correct for sampling error - but you did.

I'm not an expert on plate notation, though, so I'm not sure which MLM you used. Is it basically `Revenue ~ (Covariate_1 + ... + Covariate_n | Treatment | Category)`?
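For readers unfamiliar with the shrinkage being praised here, a minimal sketch of partial pooling in a normal-normal model: each test's observed uplift is pulled toward its category mean in proportion to how noisy the measurement is. This is a textbook simplification with made-up numbers, not the model used in the paper, and `shrink` is a hypothetical name. A real multilevel model would estimate the category mean and spread jointly rather than take them as given.

```python
def shrink(observed, std_error, category_mean, category_sd):
    """Posterior mean of a true effect under a normal-normal model.

    observed:       uplift measured in one test
    std_error:      standard error of that measurement
    category_mean:  average true uplift across the category (assumed known)
    category_sd:    spread of true uplifts across the category (assumed known)
    """
    # Weight on the data: near 1 for precise tests, near 0 for noisy ones.
    weight = category_sd**2 / (category_sd**2 + std_error**2)
    return weight * observed + (1 - weight) * category_mean

# Hypothetical category: true uplifts average +1% with a spread of 2%.
noisy = shrink(observed=0.15, std_error=0.06, category_mean=0.01, category_sd=0.02)
precise = shrink(observed=0.15, std_error=0.005, category_mean=0.01, category_sd=0.02)
print(f"noisy test:   +15% observed -> {noisy:+.1%} after shrinkage")
print(f"precise test: +15% observed -> {precise:+.1%} after shrinkage")
```

The same +15% observation is shrunk heavily when its standard error is large and barely at all when it is small.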


> It would be great to see the max uplift achieved for each category

Indeed. What matters most with these kinds of experiments isn't really the average results, but what is possible and the distribution among beneficial results only. After all, the whole point of A/B testing is to try experiments and then either keep the changes if they improve results or stay with what you've already got if the changes didn't bring an improvement. Surely all the treatments that led to negative changes would just have been discarded in practice? It's still important to see the full picture as well, if only to guide decisions about which experiments are even worth trying, but I think there's another side that doesn't fully come through here.


I think the big error in A/B testing is that expectations are quite often very unrealistic. Designers typically have a reasonably good idea about what will work and what will not. Finding 'million dollar buttons' is rare. Of course a couple of percent or even tens of percent of improvement is nothing to sneeze at. But thinking that by A/B testing forever you're going to make a shrub grow into a tree is, IMO, not realistic. Besides, a continuously changing user interface is often in itself a barrier to sales.

Ironically, the companies that have benefited most from A/B testing were the ones doing a terrible job in the first place, so there was lots of low-hanging fruit to make the consultants look good.

Yet another item often missed: A/B testing success is a direct function of the length of the lever you are pulling. If that lever commands billions of dollars then it is easy to make it pay for itself. But if you're trying to turn $10,000 into $11,500 then you are likely wasting your time.


> isn't really the average results, but what is possible and the distribution among beneficial results only

No, the bad results also matter: you are still spending visitors and revenues in testing out bad variants, which is part of determining the costs and benefits. Even with a bandit approach, you incur logarithmic regret in the number of variants. And testing a bad variant is common: the best category, 'scarcity', has a 16% probability of the variant being harmful. A Value of Information calculation has to take into account the harm done while testing.


(Hence the final sentence of my previous comment.)


Visa/Mastercard may well be forced to try something like this in order to survive. Banks are opening up their own APIs for P2P payments (at least within the EU, due to legislation). This could completely negate the need for intermediate "payment networks" (with the following caveat).

However - I don't think you'll see a proliferation of new payment methods. The biggest problem here would be fraud mitigation, so it'd need to be a payment provider the merchant deems trustworthy enough.

Interesting times ahead. Amex are just doing the bare minimum to keep up here.


This looks good, and is sorely needed.

It seems one of Stripe's biggest risks is the impending PSD2/XS2A changes within the EU/UK. This means banks/merchants/retailers will ditch traditional card networks (and their fees) to instruct P2P payments directly. This probably opens up a host of very effective anti-fraud measures too (e.g. 2FA with mobile devices).

I wonder how Stripe will react to this major change in the market?

For example: https://developer.americanexpress.com/products/accept-amex


> I wonder how Stripe will react to this major change in the market?

Probably not, as they are quite US-focused and 3D-Secure is still in closed beta. Probably better margins in the US.


I've learned a huge amount from screencasts such as destroyallsoftware.com, so I think we can agree to disagree on this point. Additionally, I think the success of Codeschool, Code Academy, Pluralsight etc. shows that I'm not really in the minority here.

A number of points you mention are possible topics. Although I can't think of a single situation where "understanding several compilers" would have helped me design/maintain/troubleshoot infrastructure I'm responsible for.

But hey, looks like you're not in my target market, and that's ok!


> [...] I can't think of a single situation where "understanding several compilers" would have helped me design/maintain/troubleshoot infrastructure I'm responsible for.

Oh, sure, you don't need to understand how ELF binaries work, until you try to do anything non-trivial with them (building a chroot image, anyone?). You also don't need to know how Ruby or Python work with modules, but I'll want to stay away from any system of yours where you happen to install some random, recently developed software, because it will be a mess.

> But hey, looks like you're not in my target market, and that's ok!

Of course I'm not. What you proposed is a list for novice sysadmins, except it doesn't touch the essence of the craft, focusing instead on shiny bells and whistles of limited applicability that will be obsolete five years from now.


Wow, thanks for the response Justin. Very useful to get your perspective on this. I have some teaching experience (Code Schools etc.) but this would be my first attempt at screencasting. Very useful to hear some of these issues first hand!


Feel free to ping me (email in profile) if you have any questions, want to chat, or need advice.


You're too kind! If I pursue this, I will definitely be in touch.


Couldn't this have built upon https://github.com/facebook/css-layout ?


An enormous improvement over the old dev tools! Good work FB team.

