High Scalability

Business Case for Serverless

High Scalability - Mon, 2017-02-27 16:56

You can’t pick a technical direction without considering the business implications. Mat Ellis, Founder/CEO of Cloudability, in a recent CloudCast episode, makes the business case for Serverless. The argument goes something like:

  • Enterprises know they can’t run services cheaper than Amazon. Even if the cost is 2x, the extra agility of the cloud is often worth the multiple.

  • So enterprises are moving to the cloud.

  • Moving to the cloud is a move to services. How do you build services now? Using Serverless.

  • With services, businesses get a familiar cost-per-unit billing model: they can think of paying for services as a cost per database query, cost per terabyte of data, and so on.

  • Since employees are no longer managing boxes and infrastructure they can now focus entirely on business goals.

  • There’s now an opportunity to change business models. Serverless will make new businesses economically viable because they can do things they could never do before based on price and capabilities.

  • Serverless makes it faster to iterate and deploy new code, which in turn makes it faster to find a proper product/market fit.

  • Smaller teams with smaller budgets with smaller revenues can do things now that only big companies could do before. Serverless attempts to industrialise developer impact.

  • Consider WhatsApp, which sold to Facebook for $19 billion with only 55 employees. If we’re going to see the first single-employee, billion-user, multi-billion-dollar-valuation startup, it will likely be built on Serverless.

Categories: High Scalability

Stuff The Internet Says On Scalability For February 24th, 2017

High Scalability - Fri, 2017-02-24 16:56

Hey, it's HighScalability time:

 

Great example of Latency As A Pseudo-Permanent Network Partition. A slide effectively cleaved Santa Cruz from the North Bay by slowing traffic to a crawl.
If you like this sort of Stuff then please support me on Patreon.
  • 40 TFLOPS: on Lambda; 7: new habitable planets with good beer; dozens: balloons needed in Loon network; 500 TB/sec: rate at which DNA is copied in human body; 1/2: web is encrypted; 34: regions in Azure; $8k: cost of Tesla self-driving hardware; 99.95%: DMCA takedowns are bot BS; 300 nanometers: new microscope; 7%: AMP traffic to publishers; 

  • Quotable Quotes:
    • @jasonlk: Elon Musk: Self-Driving Car Revolution Will Leave 15% of World Population Without Jobs
    • Near death Archimedes: Stand away, fellow, from my diagram!
    • rumpelstilskin21: Angular and React make for popular headlines on reddit but unless you are working for a major, large web site where such things might be deemed useful by management (and no one else) then quit trying to get educated by the amateurs on reddit.
    • StorageMojo: There is a new paradigm about to hit the industry, which will eviscerate large portions of the current storage ecosystem. Like other major shifts, it is powered by a class of users who are poorly served by existing products and technologies. But if our digital civilization is to survive and prosper, it has to happen. And it will, like it or not.
    • ThatMightBePaul: Worst case scenario: you try Go, don't like it, and you head back to Node more confident that it fits you better. That's still a pretty positive outcome, imo. So, invest the time in Go, and then see which feels right :)
    • Russ: it is the job of the application to properly figure out the network’s limits and try to live within them.
    • World's Second-Best Go Player: After humanity spent thousands of years improving our tactics, computers tell us that humans are completely wrong. I would go as far as to say not a single human has touched the edge of the truth of Go.
    • @mjpt777: After fixing a few more false sharing issues we shaved another ~350ns of Aeron's RTT between machines.
    • @thomasfuchs: 1997: Let’s make a website! *fires up vi* 2007: Let’s make a website! *downloads jQuery* *fires up vi* 2017: Let’s make a website! [very long list of tech]
    • Basho: Do not follow the ancient masters, seek what they sought.
    • hellofunk: If many years ago, someone told me that a humongous company named Alphabet was thinking about deploying balloons all over the world, I'd have told you a thing or two about having a charming imagination. 
    • Russ: Sure, the Internet is broken. But anything we invent will, ultimately, be broken in some way or another. Sure the IETF is broken, and so is open source, and so is… whatever we might invent next. We don’t need a new Internet, we need a little less ego, a lot less mud slinging, and a lot more communication. 
    • @sAbakumoff: Analyzed the sentiment of 80000 Github Commit Comments, it seems that Ruby devs tend to be pretty positive, but c++ are angriest ones!
    • Michael Sawyer: The YouTubers' common enemy is YouTube
    • @jannis_r: "Good size for a microservice: if it fits into one engineers head" @adrianco #AWSTechBreakfast
    • packagecloud: setting [TZ] environment variable can save thousands (or in some cases, tens of thousands) of unnecessary system calls that can be generated by glibc over small periods of time. 
    • @istanboolean: "Hardware has stopped getting faster. Software has not stopped getting slower." @rob_pike
    • Greg Meddles: You're out of memory on some particular Amazon instance, so you bump up to the next biggest in size. That is always the naive solution. Whatever you're doing, you'll usually end up doing more of it. Eventually, you'll end up throwing good money after bad.
    • @viktorklang: Replace the use of sequential, concurrent, and parallel with dependent, coordinated, and independent? Thoughts?
    • Coast Guard Vice Adm. Marshall Lytle: Cyberwarfare is like a soccer game with all the fans on the field with you and no one is wearing uniforms
    • CockroachDB: If you’re serious about building a company around open source software, you must walk a narrow path: introduce paid features too soon, and risk curtailing adoption. Introduce paid features too late, and risk encouraging economic free riders. Stray too far in either direction, and your efforts will ultimately continue only as unpaid open source contribution
    • Veratyr: Deployment [of k8s] is just so much harder than it should be. Fundamentally (I discovered far later on in the process), Kubernetes is comprised of roughly the following services: kube-apiserver, kubelet, kube-proxy, kube-scheduler, kube-controller-manager. The other dependencies are: A CA infrastructure for certificate based authentication, etcd, a container runtime (rkt or Docker) and CNI.
    • @jbeda: I want to go on record: the amount of yaml required to do anything in k8s is a tragedy. Something we need to solve. 
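The packagecloud point about TZ in the quotes above is easy to try. This is a minimal sketch, assuming a Linux host with glibc where /etc/localtime exists; the helper name pin_timezone and the value ":/etc/localtime" are illustrative choices, not something taken from the quoted post. Whether a given program actually triggers the extra system calls depends on which libc time functions it uses, so a tool like strace is the way to confirm the effect.

```python
import os
import time


def pin_timezone(tz_value=":/etc/localtime"):
    """Set TZ for this process so the C library can load the zone once.

    The default value is an assumption for a typical Linux host.
    """
    os.environ["TZ"] = tz_value
    # tzset() only exists on Unix; it tells the C library to re-read TZ.
    if hasattr(time, "tzset"):
        time.tzset()
    return os.environ["TZ"]


if __name__ == "__main__":
    print("TZ is now", pin_timezone())
    print(time.strftime("%Y-%m-%d %H:%M:%S %Z"))
```

In practice the variable is more often exported in the service's environment (for example in its unit file or container definition) than set from inside the process.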

  • What do you get for $5? Quite a lot. $5 Showdown: Linode vs. DigitalOcean vs. Amazon Lightsail vs. Vultr: Linode’s new plan is not only offering the consistently better performance...Linode is still a bit behind the curve when it comes to things like block storage volumes, default SSH keys and yeah, their UI.

  • Another wonderful engineering post from Riot Games. Under the hood of the League Client's Hextech UI: Any given build of the League client is expressed as a list of units called plugins... Back-end plugins that deal purely with data are written as C++ REST microservices...front-end plugins that deal with presentation are written as Javascript client applications and run inside Chromium Embedded Framework...The League client update really is a desktop deployment of an entire constellation of microservices...APIs are thoughtfully designed, any arbitrary combination of features can run cooperatively...In the League client, the common pattern is for dependencies to flow upwards...a WebSocket that allows the front-end plugins to observe back-end plugins for changes...To make implementation of complex video-based elements simpler, we created a state machine library based on Web Components...League client is patched out to players’ local drives, it doesn’t have the same immediate bandwidth constraints...we provide a number of purpose-specific audio channels - UI SFX, Notifications, Music, Voiceover, etc. - through a plugin dedicated to managing audio...We use straight-up native Custom Elements with heavy usage of Shadow DOM.

  • Does insurance cover this? The first SHA1 collision.

Don't miss all that the Internet has to say on Scalability, click below and become eventually consistent with all scalability knowledge (which means this post has many more items to read so please keep on reading)...

Categories: High Scalability

Scaling @ HelloFresh: API Gateway

High Scalability - Mon, 2017-02-20 16:56

HelloFresh keeps growing every single day: our product is always improving, new ideas are popping up from everywhere, our supply chain is being completely automated. All of this is simply amazing to us, but of course this constant growth brings many technical challenges.

Today I’d like to take you on a small journey that we went through to accomplish a big migration in our infrastructure that would allow us to move forward in a faster, more dynamic, and more secure way.

The Challenge

We recently built an API Gateway, and we then faced the complex challenge of moving our main (monolithic) API behind it, ideally without downtime. This would enable us to create more microservices and hook them into our infrastructure without much effort.
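One common way to put a monolith behind a gateway is to route everything to it by default and peel off path prefixes as new services appear. The sketch below illustrates that idea only; it is not HelloFresh's implementation, and every hostname, prefix, and function name in it is hypothetical.

```python
# Hypothetical upstreams; none of these names come from the HelloFresh post.
MONOLITH = "http://monolith.internal"

ROUTES = {
    "/recipes": "http://recipes.internal",
    "/menus": "http://menus.internal",
}


def resolve_upstream(path, routes=ROUTES, default=MONOLITH):
    """Pick the upstream for a request path using longest-prefix match.

    Paths that match no registered prefix fall through to the monolith,
    so services can be moved out one prefix at a time.
    """
    best = ""
    for prefix in routes:
        # Match whole path segments only: "/recipes" must not match "/recipesx".
        matches = path == prefix or path.startswith(prefix + "/")
        if matches and len(prefix) > len(best):
            best = prefix
    return routes[best] if best else default


if __name__ == "__main__":
    for p in ("/recipes/42", "/orders/7", "/menus"):
        print(p, "->", resolve_upstream(p))
```

Because unmatched paths still reach the monolith, adding a route is the only change needed to hook in a new service.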

The Architecture

Stuff The Internet Says On Scalability For February 17th, 2017

High Scalability - Sun, 2017-02-19 18:21

Hey, it's HighScalability time:

 

Gorgeous satellite images of a thawing Greenland (NASA).
  • 1 cubic millimeter: computer with deep-Learning; 1,600: data on nearby stars; 40M: users for largest Parse app; 58x: Tensorflow 1.0 speedup on 64 gpus; 46%: ecommerce controlled by Amazon; 60%: IT growth in public cloud; 200 TB: one tv episode; 

  • Quotable Quotes:
    • @krishnan: Serverless will not be around in 5 years. It will be AI coding AI coding Ai....... Serverless or not doesn't matter #RunForrestRun
    • user5994461: Amazon: Create usual services and sell them. Google: Make unique products that push the boundaries of what was previously thought possible. Amazon: Don't care about inefficiencies and usage. Inefficiencies can be handled by charging more to the clients, usage doesn't matter because the users are mostly the clients and they don't feel their pain. Google: Had to make all their core technologies efficient, performant, scalable and maintainable or they couldn't sustain their business.
    • Hans Rosling: To me, the impressive thing is that people succeed at all.
    • @littleidea: Google Spanner didn't beat CAP, just mitigated the hell out of P
    • @jordw: Cloud Spanner is a very well-engineered CP database that is also very good at being available.
    • Cade Metz: The AI Threat Isn’t Skynet. It’s the End of the Middle Class
    • hosh: Four years ago, I determined that while development work might seem to be near the top of the food chain, there will at some point where my work will be replaced by AIs.
    • mi100hael:  I found Go's "simplicity" to be limiting and frustrating when it came to building production applications. Things like the weird split between functions returning errors but occasionally panicking, lack of inheritance, and poor dependency management through github links make Go a poor choice for applications within a business setting. 
    • @NathanTippy: New #Java web server clearing 1 million HTTP requests per second on 4 core box.  Can run in < 100MB of memory.
    • @kellabyte: It doesn’t matter what the founder or developer of a database tells you. It’s about the true properties it guarantees.
    • @swardley: Private cloud starting to drop, public cloud a three horse race - AWS 1st, MSFT 2nd, GooG 3rd ... sensible stuff 
    • @ollekullberg: Kullberg's law: when we increase the size of a microservice we increase the benefit of static typing for this microservice.
    • @swardley: ... it's not lack of engineering capability or finance or market or marketing or branding, the real story of cloud is executive failure.
    • katied: Trophic cascade is a process that starts at the top of a food chain and works its way to the bottom of it. So, even though as predators wolves survive by taking life, they also have the ability to create it.
    • @swardley: Cloud wars in IaaS - oh, please. War was well over in 2012, yes there will be price cuts as constraints are reduced but there is no battle.
    • @HenryR: 1. CAP has always said only one thing: that there is always a particular network failure that forces you to give up either C or A. 2. It has nothing at all to do with how likely that failure mode is. The failure is system-specific. 
    • throwawaydbfif: The movement from ownership to renting on the web is absolutely terrifying to me. Within the span of a few years we've gone from owning our technology to renting it out from a big players for monthly fees that we cannot completely predict or control.
    • computerex: People use cloud computing because it already is massively impractical to run your own servers. Hardware is hard to run and scale on your own and experiences economies of scale. This principle is seen everywhere and can hardly be viewed as something controversial. 
    • stuckagain: You did not ever own your own globally consistent, massively scalable, replicated database. The fact that you can now rent one by the hour is strictly an improvement for you, if you need that kind of thing
    • tedd4u: Aurora is very cool but won't help you much after you vertically scale your master and still need more write capacity. With Cloud Spanner you get horizontal write scalability out of the box. Critical difference.
    • @koivimik: REST != CRUD via HTTP #microXchg @olivergierke
    • Linus: It's almost boring how well our process works. All the really stressful times for me have been about process. They haven't been about code. When code doesn't work, that can actually be exciting ... Process problems are a pain in the ass. You never, ever want to have process problems ... That's when people start getting really angry at each other.
    • @littleidea: Almost every task run under Borg contains a built-in HTTP server that publishes information about the health of the task...
    • W. Daniel Hillis: For Richard [Feynman], figuring out these problems was a kind of a game. He always started by asking very basic questions like, “What is the simplest example?” or “How can you tell if the answer is right?” He asked questions until he reduced the problem to some essential puzzle that he thought he would be able to solve.
    • @ewolff: "Every hackathon uses Lambda. They build really complicated, production-ready systems in 12h" @adrianco at @microXchg
    • Daniel Bryant: The term "microservices" itself will probably disappear in the future, but the new architectural style of functional decomposition is here to stay.
    • @rbranson: The NoSQL movement might be a disappointment, but emerging from the rubble is the log-based (i.e. Kafka) model that actually works.
    • Chip Overclock: Surprisingly, GPS satellites actually know nothing about position. What they know about is time.
    • @codinghorror: I look at my old blog posts and think... there was a time when I believed 24GB was a lot of RAM
    • vidarh: Depending on your workloads, DO servers can come out cheaper or more expensive than AWS, but bandwidth at DO is so much cheaper than AWS that for bandwidth intensive stuff I can't serve entirely out of Europe (where Hetzner is vastly cheaper than DO again), DO is often a much cheaper alternative. Sometimes we use it as a cost-cutting do-it-yourself CDN in front of AWS for clients that insist on S3 for storage (and again where we can't just cache everything in Europe for latency reasons). For bandwidth heavy applications, you can pay for significant numbers of Droplets from the AWS bandwidth savings alone.
    • lobster_johnson: we use Google Container Engine (hosted Kubernetes), with Salt for the non-GKE VMs. This is needed because K8s is not mature enough to host all the things. In particular, stateful sets are still in beta. 
    • anonymous: The overall impact [algorithms] will be utopia or the end of the human race; there is no middle ground foreseeable. I suspect utopia given that we have survived at least one existential crisis (nuclear) in the past and that our track record toward peace, although slow, is solid.
    • keenio: In conclusion, the TCO is probably significantly lower for Kinesis. So is the risk. And in most projects, risk-adjusted TCO should be the final arbiter.
    • Adem Efe Gencer: the weekly [Bitcoin] mining power of a single miner has never exceeded the 30% of the overall mining power in 2016. Moreover, in the second half of the year, the highest mining power has consistently been under the 20% range.
    • David Rosenthal: The security downside of Postel's Law is even more fundamental. The law requires the receiver to accept, and do something sensible with, malformed input. Doing something sensible will almost certainly provide an attacker with the opportunity to make the receiver do something bad.
    • douche: That's pretty much the way it has always been. You can go back at least to the Civil War and find politics has had more to do with procurement than performance of the weapon systems in question.
    • Jonathan Suen: While the brain and the Internet clearly operate using very different mechanisms, both use simple local rules that give rise to global stability. I was initially surprised that biological neural networks utilized the same algorithms as their engineered counterparts, but, as we learned, the requirements for efficiency, robustness, and simplicity are common to both living organisms and the networks we have built.
    • Bruce Johnson: Code reviews set the tone for the entire company that everything we do should be open to scrutiny from others, and that such scrutiny should be a welcome part of your workflow rather than viewed as threatening.
    • codingmyway: I think some miners are against any increase because it will lower fees. Without a blocksize limit fees tend to zero, which is fine while there is the block reward but they still want to milk the congestion fees. To say they are pro segwit or pro unlimited is bluffing. They are pro status quo and congestion and high fees.
    • edejong: Many engineers I have worked with like to throw around terms like: "CQRS", "Event sourcing", "no schema's", "document-based storage", "denormalize everything" and more. However, when pushed, I often see that they lack a basic understanding of DBMSes, and fill up this gap by basically running away from it. For 95% of the jobs, a simple, non-replicated (but backed-up) DBMS will do just fine.
    • adamu__: If China were to shut down bitcoin mining, my understanding is that the worst case scenario is much more dire. The network only adjusts the 'difficulty' relative to current network hash power every 2,016 blocks. Depending on the severity of the overall hash power reduction, new block discovery might slow down significantly. This would also delay a recalculation of the new difficulty accommodating the reduction in hash power. The network could be severely throttled for weeks.
    • boulos: Slightly off-topic, but EC2 doesn't really scale independently if you compare it to GCE. We let you combine 24 vcpus with 39 GB of RAM, 3 partitions of Local SSD and a few GPUs, all independently (though the ratio of RAM to vcpu is currently bounded between .9 and 6.5).
    • Veratyr: Personally, I settled with colocation. I pay $60/mo + $2k one-off for the initial hardware + say $150/5y/4TB HDD, which, for 80TB of storage over 5y comes out to a total of ~$88/mo, or $0.001/GBmo. 
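The adamu__ quote above about difficulty adjustment can be made concrete with a little arithmetic. This is a rough sketch, assuming hash power drops instantly to some surviving fraction and then stays constant; the 10-minute target and the 2,016-block interval are Bitcoin's design parameters, while the function name and the sample fractions are illustrative.

```python
TARGET_BLOCK_MINUTES = 10   # Bitcoin's target interval between blocks
RETARGET_INTERVAL = 2016    # blocks between difficulty adjustments


def days_until_retarget(blocks_remaining, hash_power_fraction):
    """Expected days until the next difficulty adjustment.

    hash_power_fraction is the surviving share of hash power, in (0, 1].
    With less hash power at unchanged difficulty, each block takes
    proportionally longer to find.
    """
    if not 0 < hash_power_fraction <= 1:
        raise ValueError("hash_power_fraction must be in (0, 1]")
    if not 0 <= blocks_remaining <= RETARGET_INTERVAL:
        raise ValueError("blocks_remaining must be between 0 and 2016")
    minutes = blocks_remaining * TARGET_BLOCK_MINUTES / hash_power_fraction
    return minutes / (60 * 24)


if __name__ == "__main__":
    for fraction in (1.0, 0.5, 0.25):
        days = days_until_retarget(RETARGET_INTERVAL, fraction)
        print(f"{fraction:.0%} of hash power: {days:.0f} days")
```

At full hash power a complete 2,016-block period takes 14 days; if half the hash power vanished right after an adjustment, the same period would stretch to 28 days, which is the throttling the quote describes.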

  • Now this is object oriented programming. New software for increasingly flexible factory processes: new software that allows each individual component to tell the machine what has to be done. By breaking away from central production planning, factories can achieve unprecedented agility and flexibility... Everything would go much faster if production and the requisite machines were not rigidly set by a control program, but if every component itself knew the best way for it to be moved quickly through the process chain. 

  • Relax. Videos from TensorFlow Dev Summit 2017 are now available. Also, Learn TensorFlow and deep learning, without a Ph.D. Also also, Deep Learning book.

  • Google is Introducing Cloud Spanner: a global database service for mission-critical applications. It will be interesting to see if Spanner, as a unique hard to duplicate feature, becomes a Google Cloud differentiator. Will it make the delta between the clouds significant enough that developers choose Google? Quizlet, already running on GCP, really likes Spanner, but it's not a drop in replacement for MySQL. Like with NoSQL there's special care and feeding to make it work, but that's the sacrifice high QPS requires. Performance: "Cloud Spanner queries have higher latency at low throughputs compared with a virtual machine running MySQL. Spanner's scalability, however, means that a high-capacity cluster can easily handle workloads that stretch our MySQL infrastructure." And p90s are consistently lower than 50 ms. Cost: "For very small or low-throughput databases Cloud Spanner is overkill [min ~$8,000/yr]...Cloud Spanner comparable or slightly cheaper based on the performance in our testing." With Spanner hitting the market maybe that will help CockroachDB? Some older articles: Spanner - It's About Programmers Building Apps Using SQL Semantics At NoSQL Scale; Google Spanner's Most Surprising Revelation: NoSQL Is Out And NewSQL Is In; F1 And Spanner Holistically Compared; How Google Invented An Amazing Datacenter Network Only They Could Create.


Categories: High Scalability

Sponsored Post: Aerospike, GoCardless, Auth0, InnoGames, Contentful, Stream, Scalyr, VividCortex, MemSQL, InMemory.Net, Zohocorp

High Scalability - Tue, 2017-02-14 16:56

Who's Hiring?
  • GoCardless is building the payments network for the internet. We’re looking for DevOps Engineers to help scale our infrastructure so that the thousands of businesses using our service across Europe can take payments. You will be part of a small team that sets the direction of the GoCardless core stack. You will think through all the moving pieces and issues that can arise, and collaborate with every other team to drive engineering efforts in the company. Please apply here.

  • InnoGames is looking for Site Reliability Engineers. Do you not only want to play games, but help build them? Join InnoGames in Hamburg, one of the worldwide leading developers and publishers of online games. You are the kind of person who leaves systems in a better state than they were before. You want to hack on our internal tools based on django/python, as well as improve the stability of our 5000+ Debian VMs. Orchestration with Puppet is your passion and you would rather automate stuff than touch it twice. Relational Database Management Systems aren't a black hole for you? Then apply here!

  • Contentful is looking for a JavaScript BackEnd Engineer to join our team in their mission of getting new users - professional developers - started on our platform within the shortest time possible. We are a fun and diverse family of over 100 people from 35 nations with offices in Berlin and San Francisco, backed by top VCs (Benchmark, Trinity, Balderton, Point Nine), growing at an amazing pace. We are working on a content management developer platform that enables web and mobile developers to manage, integrate, and deliver digital content to any kind of device or service that can connect to an API. See job description.
Fun and Informative Events
  • DBTA Roundtable Webinar: Fast Data: The Key Ingredients to Real-Time Success. Thursday February 23, 2017 | 11:00 AM Pacific Time. Join Stephen Faig, Research Director Unisphere Research and DBTA, as he hosts a roundtable discussion covering new technologies that are coming to the forefront to facilitate real-time analytics, including in-memory platforms, self-service BI tools and all-flash storage arrays. Brian Bulkowski, CTO and Co-Founder of Aerospike, will be speaking along with presenters from Attunity and Hazelcast. Learn more and register.

  • Your event here!
Cool Products and Services
  • Working on a software product? Clubhouse is a project management tool that helps software teams plan, build, and deploy their products with ease. Try it free today or learn why thousands of teams use Clubhouse as a Trello alternative or JIRA alternative.

  • A note for .NET developers: You know the pain of troubleshooting errors with limited time, limited information, and limited tools. Log management, exception tracking, and monitoring solutions can help, but many of them treat the .NET platform as an afterthought. You should learn about Loupe...Loupe is a .NET logging and monitoring solution made for the .NET platform from day one. It helps you find and fix problems fast by tracking performance metrics, capturing errors in your .NET software, identifying which errors are causing the greatest impact, and pinpointing root causes. Learn more and try it free today.

  • Auth0 is the easiest way to add secure authentication to any app/website. With 40+ SDKs for most languages and frameworks (PHP, Java, .NET, Angular, Node, etc), you can integrate social, 2FA, SSO, and passwordless login in minutes. Sign up for a free 22 day trial. No credit card required. Get Started Now.

  • Build, scale and personalize your news feeds and activity streams with getstream.io. Try the API now in this 5 minute interactive tutorial. Stream is free up to 3 million feed updates so it's easy to get started. Client libraries are available for Node, Ruby, Python, PHP, Go, Java and .NET. Stream is currently also hiring Devops and Python/Go developers in Amsterdam. More than 400 companies rely on Stream for their production feed infrastructure, including apps with 30 million users. With your help we'd like to add a few zeros to that number. Check out the job opening on AngelList.

  • Scalyr is a lightning-fast log management and operational data platform. It's a tool (actually, multiple tools) that your entire team will love. Get visibility into your production issues without juggling multiple tabs and different services -- all of your logs, server metrics and alerts are in your browser and at your fingertips. Loved and used by teams at Codecademy, ReturnPath, Grab, and InsideSales. Learn more today or see why Scalyr is a great alternative to Splunk.

  • InMemory.Net provides a Dot Net native in memory database for analysing large amounts of data. It runs natively on .Net, and provides a native .Net, COM & ODBC apis for integration. It also has an easy to use language for importing data, and supports standard SQL for querying data. http://InMemory.Net

  • VividCortex is a SaaS database monitoring product that provides the best way for organizations to improve their database performance, efficiency, and uptime. Currently supporting MySQL, PostgreSQL, Redis, MongoDB, and Amazon Aurora database types, it's a secure, cloud-hosted platform that eliminates businesses' most critical visibility gap. VividCortex uses patented algorithms to analyze and surface relevant insights, so users can proactively fix future performance problems before they impact customers.

  • MemSQL provides a distributed in-memory database for high value data. It's designed to handle extreme data ingest and store the data for real-time, streaming and historical analysis using SQL. MemSQL also cost effectively supports both application and ad-hoc queries concurrently across all data. Start a free 30 day trial here: http://www.memsql.com/

  • ManageEngine Applications Manager : Monitor physical, virtual and Cloud Applications.

  • www.site24x7.com : Monitor End User Experience from a global monitoring network. 

If any of these items interest you there's a full description of each sponsor below...

Categories: High Scalability

Part 3 of Thinking Serverless —  Dealing with Data and Workflow Issues

High Scalability - Mon, 2017-02-13 16:56

This is a guest repost by Ken Fromm, a 3x tech co-founder — Vivid Studios, Loomia, and Iron.io. Here's Part 1 and 2

This post is the third of a four-part series that will dive into developing applications in a serverless way. These insights are derived from several years working with hundreds of developers while they built and operated serverless applications and functions. The platform was the serverless platform from Iron.io but these lessons can also apply to AWS Lambda, Google Cloud Functions, Azure Functions, and IBM’s OpenWhisk project.

Serverless Processing — Data Diagram

Thinking Serverless! The Data

Stuff The Internet Says On Scalability For February 10th, 2017

High Scalability - Fri, 2017-02-10 16:56

Hey, it's HighScalability time:

 

It was a game of drones.
  • Half a trillion: Apple’s cash machine; 4,000-5,000: collected data points per adult in US; 10 million: gallons of gas UPS saves turning right; 2.27: Tesla 0-60 time; 40: complex steps to phone security; $2.3 billion: VR/AR investment in 2016; 18%: small players make up public cloud services market; 500°C: first chip to survive on Venus; 5 billion: ever notes; 375,000: images from The Metropolitan Museum of Art in public domain; 18 million: queries per minute against Facebook's Beringei database; 159: jobs per immigrant founder; 2.5 miles: whales breach for stronger signal; 10,000x: computers faster in 2035; 

  • Quotable Quotes: 
    • @martin_casado: Chinese factory replaces 90% of human workers with robots. Production rises by 250%, defects drop by 80%
    • Jure Leskovec: It’s [trolling] a spiral of negativity. Just one person waking up cranky can create a spark and, because of discussion context and voting, these sparks can spiral out into cascades of bad behavior. Bad conversations lead to bad conversations. People who get down-voted come back more, comment more and comment even worse.
    • sudhirj: The first concrete thing I learnt is this - implement pull first, it works 100% of the time, but may be inefficient with regards to time. Then implement push, it works 99% of the time but is much faster. But always have both running.
    • Tom Randall: California’s goal is considerable, but it’s dwarfed by Tesla’s ambition to single-handedly deliver 15 gigawatt hours of battery storage a year by the 2020s—enough to provide several nuclear power plants–worth of electricity to the grid during peak hours of demand
    • @aphyr: Like I can't show that it's 100% correct, but so far I haven't found a way to break 3.4.0. Opens up a bunch of new use cases for MongoDB.
    • Azethoth666: The coming fast non-volatile memory architectures will be interesting. Everything will be in memory, but it will not go away. The infection cycle will have to clean up after itself or remain in the super fast volatile memory parts.
    • StorageMojo: In five years the specter of AWS cloud dominance will be a distant memory. The potential cloud market is enormous and we are, in effect, where the computer industry was in 1965. AWS will be successful, just not dominant. No tears for AWS.
    • @johnrobb: ~ 'Bots make public conversation a synthetic conversation. This makes it very difficult to know what consensus looks like.'
    • W. Daniel Hillis: One day when I was having lunch with Richard Feynman, I mentioned to him that I was planning to start a company to build a parallel computer with a million processors. His reaction was unequivocal, “That is positively the dopiest idea I ever heard.”
    • @supershabam: Every database is a message bus if you try hard enough
    • mlechha: Boltzmann machines are a stochastic version of the Hopfield network. The training algorithm simply tries to minimize the KL divergence between the network activity and real data. So it was quite surprising when it turned out that the algorithm needed a "dream phase" as they call it. Francis Crick was inspired by this and proposed a theory of sleep.
    • @benjammingh: OH "Docker is Latin for a fire consisting predominantly of tires"
    • UweSchmidt: "Real" bitcoining doesn't use services like coinbase; the coins are on your computer which you have to secure yourself. At least this is what you get told in cryptocurrency forums when one of the exchanges get hacked.
    • @axleyjc: 'Think of your System as a "Set of annotated request trees"' to manage microservice complexity @adrianco @ExpediaEng
    • @happy_roman: VW CEO on Tesla: "We'll win in the end, because of our abilities to scale & spread production."
    • aaron bell: Whichever cloud provider you pick based on your needs and their specific offering, I beg of you — please don’t try hybrid
    • zebra9978: Kubernetes introduces a lot of upfront complexity with little benefit sometimes. For example, kargo is failing with Flannel, but works with Calico (and so on and so forth). Bare metal deployments with kubernetes are a big pain because the load balancer setups have not been built for it - most kubernetes configs depend on cloud based load balancers (like ELB). In fact, the code for bare metal load balancer integration has not been fully written for kubernetes.
    • a13n: This is huge. 87-99% shared code between iOS and Android. Someday companies as big as Instagram won't need to have entire separate product teams for separate platforms.
    • David Rosenthal: I've always said that the chief threat to digital preservation is economic; digital information being very vulnerable to interruptions in the money supply. 
    • YZF: There are no channels [in C++], there are no lightweight/green threads, there's no standard HTTP library, no standard crypto libraries, no standard test framework. For certain classes of applications this makes Go significantly more productive and significantly less bug/error prone. Not to mention compile times.
    • jdwyah: Kinesis firehose to S3 and then query with Athena is pretty great. I've been very happy with the combo.
    • mcherm: Your example from RethinkDB really struck home to me. The idea that superior technology might lose out due to poor marketing or (in this case) a system that is optimized for the real world rather than being optimized for benchmarks really disturbs me.
    • Aras Pranckevičius: Moral of the story is: code that used to do something with five things years ago might turn out to be problematic when it has to deal with a hundred. And then a thousand. And a million. And a hundred billion. Kinda obvious, isn’t it?
    • kordless: I've come to a hypothesis that technology's purpose is to gently erode the concept of "self"
    • Microsoft: Close to a year ago we reset and focused on how we would actually get Git to scale to a single repo that could hold the entire Windows codebase (include estimates of growth and history) and support all the developers and build machines.
    • XorNot: I've run extensive benchmarks of Hadoop/HBase in Docker containers, and there is no performance difference. There is no stability difference (oh a node might crash? Welcome to thing which happens every day across a 300 machine cluster). Any clustered database setup should recover from failed nodes. Any regular relational database should be pretty close to automated failover with replicated backups and an alert email. Containerization doesn't make this better or worse, but it helps a lot with testing and deployment.
    • Dan Luu: When I was at Google, someone told me a story about a time that “they” completed a big optimization push only to find that measured page load times increased. When they dug into the data, they found that the reason load times had increased was that they got a lot more traffic from Africa after doing the optimizations. The team’s product went from being unusable for people with slow connections to usable, which caused so many users with slow connections to start using the product that load times actually increased.

  • There's a quintessential Silicon Valley moment in The Founder, a movie about the more-interesting-than-expected McDonald's origin story. Brothers Mac and Dick McDonald kicked around from startup to startup. Nothing stuck. Drive-ins ruled the day, but were ripe for disruption. They were slow, used lots of servers, had too many options, attracted the wrong user base, and often delivered the wrong results. Metrics told them users mostly bought burgers, fries, and milkshakes. So the brothers decided to completely rethink how burgers were made and sold. What they came up with disrupted the food industry: a serverless drive-in based on a new low latency pipeline for making burgers called the Speedy System. An order was delivered within 30 seconds of being made; metrics helped control the latency distribution. Here's a short vignette showing how it was done. You'll love it. They traced the exact dimensions of the kitchen and conducted numerous simulations to figure out the optimal configuration. Users walk up to the window to order, so no servers. The API was narrow; only a few items could be ordered. Waste was reduced because utensils were done away with using innovative packaging design. Automation and a proprietary tool chain delivered a consistent product experience. And as often happens in Silicon Valley, the founders were outmaneuvered. While the McDonald brothers innovated the tech, Ray Kroc innovated the business model. Guess who ended up with everything? 

Don't miss all that the Internet has to say on Scalability, click below and become eventually consistent with all scalability knowledge (which means this post has many more items to read so please keep on reading)...

Categories: High Scalability

In-memory noSQL DBMS Client in Big Data Cluster

High Scalability - Wed, 2017-02-08 16:56

This is a guest post by Sergei Sheinin, creator of the 2DX Web UI Database Cluster Framework, a low latency big data cluster with in-memory noSQL DBMS Web Browser client.

When I began working in the field of data management, the disconnect between the rigid structure of relational database tables and the free form of documents managed by end users and their businesses stood out as a technical and managerial hurdle. On the one hand there were strict definitions of normalized relational database models; on the other, unstructured document formats. Often the users in charge of changing document structures held organizational responsibilities far removed from database modeling or programming. On one occasion I was involved in a project where call center operators made on-the-fly decisions to update a document structure based on phone conversations with customers. Such updates had to be streamed into a relational back end, creating havoc in the database structure and the build of table columns.

In seeking a permanent solution I researched the merits of the Entity-Attribute-Value (EAV) database schema and its applications. This technique proved successful in enabling front-end users to modify relational-bound documents by updating the structure described in their metadata. However, applying EAV raised its own issues: accommodating updated document metadata at times required changes to the definitions of the relational tables, the complexity of the application layer in client-server interoperability demanded developers' attention, fact tables grew rapidly, and select queries with multiple join statements performed poorly...
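
To make the EAV idea concrete, here is a minimal sketch in Python using the standard sqlite3 module. It is not the 2DX framework's code; the table name document_values and the helpers build_eav_store, set_attribute and load_document are hypothetical names chosen for illustration. The point it shows is that a front-end user can add a new document field by inserting a row, with no change to the table definition.

```python
import sqlite3


def build_eav_store():
    """Create an in-memory database with one EAV table (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE document_values ("
        " entity_id INTEGER NOT NULL,"   # which document the row belongs to
        " attribute TEXT NOT NULL,"      # field name, chosen at run time
        " value TEXT,"                   # field value, stored as text
        " PRIMARY KEY (entity_id, attribute))"
    )
    return conn


def set_attribute(conn, entity_id, attribute, value):
    """Add or overwrite one field of a document; no ALTER TABLE is needed."""
    conn.execute(
        "INSERT OR REPLACE INTO document_values (entity_id, attribute, value)"
        " VALUES (?, ?, ?)",
        (entity_id, attribute, str(value)),
    )


def load_document(conn, entity_id):
    """Reassemble a document as a dict from its attribute rows."""
    rows = conn.execute(
        "SELECT attribute, value FROM document_values WHERE entity_id = ?",
        (entity_id,),
    ).fetchall()
    return dict(rows)


if __name__ == "__main__":
    conn = build_eav_store()
    set_attribute(conn, 1, "customer_name", "Ada")
    # An operator adds a brand-new field during a call:
    set_attribute(conn, 1, "callback_time", "15:00")
    print(load_document(conn, 1))
```

Every field becomes its own row, which is where the rapidly growing fact tables mentioned above come from, and presenting several attributes side by side as columns in SQL typically means joining the table to itself once per attribute.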

Categories: High Scalability

Part 2 of Thinking Serverless —  Platform Level Issues 

High Scalability - Mon, 2017-02-06 16:56

This is a guest repost by Ken Fromm, a 3x tech co-founder — Vivid Studios, Loomia, and Iron.io. Here's Part 1.

Job processing at scale at high concurrency across a distributed infrastructure is a complicated feat. There are many components involved: servers and controllers to process and monitor jobs, controllers to autoscale and manage servers, controllers to distribute jobs across the set of servers, queues to buffer jobs, and a whole host of other components to ensure jobs complete and/or are retried, and other critical tasks that help maintain high service levels. This section peels back the layers a bit to provide insight into important aspects within the workings of a serverless platform.

Throughput

Throughput has always been the coin of the realm in computer processing: how quickly can events, requests, and workloads be processed. In the context of a serverless architecture, I’ll break throughput down further when discussing both latency and concurrency. At the base level, however, a serverless architecture does provide a more beneficial architecture than legacy applications and large web apps when it comes to throughput because it provides for far better resource utilization.

Travis Reeder addresses this topic in his post What is Serverless Computing and Why is it Important.

Cost and optimal use of resources is a huge reason to do serverless. If you are a big company with a bunch of apps/APIs/microservices, you are currently running those things 24/7 and they are using resources 100% of the time, no matter if they are in use or not. With a FaaS infrastructure, instead of running apps 24/7, you can execute functions for any number of apps on demand and share all the same resources. Theoretically, you could reduce waste (idle time) to almost nothing while still providing fast response time. For a FaaS provider, this cost savings is passed up to the end user, the developer. For an enterprise, this can reduce capex and opex big time.

Another way of looking at it is that by moving to more discrete tasks that can run in a universal platform with self-contained dependencies, tasks can run anytime, anywhere across a serverless architecture. This is in contrast to a set of standalone monolithic applications whereby operations teams have to spend significant cycles arbitrating which applications to scale, when, and how. (A serverless architecture can also increase throughput of application and feature development but much has been said in this regard as it relates to microservices and functions as a service.)

A Graph of Tasks and Projects

The graph below shows a set of tasks over time for a single account on a serverless platform. The overarching yellow line indicates all tasks for an account and the other lines represent projects within the account. The project lines should be viewed as a microservice or a specific set of application functions. A few years ago, the total set would have been built as a traditional web application and hosted as a long-running application. As you can see, however, each service or set of functions has a different workload characteristic. Managing the aggregated set at an application level is far more complex than managing at the task level within a serverless platform, not to mention the resource savings by scaling commodity task servers as opposed to much more complex application servers.

All Tasks (Application View) vs Specific Tasks (Serverless View)

Latency
Categories: High Scalability

Stuff The Internet Says On Scalability For February 3rd, 2017

High Scalability - Fri, 2017-02-03 16:56

Hey, it's HighScalability time:


 

We live in interesting times. F/A-18 Super Hornets Launch drone swarm.
If you like this sort of Stuff then please support me on Patreon.
  • 100 billion: words needed to train large networks; 73,653: hard drives at Backblaze; 300 GB/hour: raw 4k footage; 1993: server running without rebooting; 64%: of money bet is on the Patriots; 950,000: insect species; 374,000: people employed by solar energy; 10: SpaceX launched Iridium Next satellites; $1 billion: Pokémon Go revenue; 1.2 Billion: daily active Facebook users; $7.17 billion: Apple service revenue; 45%: invest in private cloud this year; 

  • Quotable Quotes:
    • @kevinmarks: #msvsummit @varungyan: Google's scale is about 10^10 RPCs per second in our microservices
    • language: "Order and chaos are not properties of things, but relations of an observer to something observed - the ability for an observer to distinguish or specify pattern."
    • general_ai: Doing anything large on a machine without CUDA is a fool's errand these days. Get a GTX1080 or if you're not budget constrained, get a Pascal-based Titan. I work in this field, and I would not be able to do my job without GPUs -- as simple as that. You get 5-10x speedup right off the bat, sometimes more. A very good return on $600, if you ask me.
    • Al-Khwarizmi: Maybe I'm just not good at it and I'm a bit bitter, but my feeling is that this DL [deep learning] revolution is turning research in my area from a battle of brain power and ingenuity to a battle of GPU power and economic means
    • Space Rogue: pcaps or it didn't happen
    • LtAramaki: Everyone thinks they understand SOLID, and when they discuss it with other people who say they understand SOLID, they think the other party doesn't understand SOLID. Take it as you will. I call this the REST phenomenon.
    • evaryont: I don’t see this as them [Google] trying to “seize” a corner of the web, but rather Google taking its paranoia to the next level. If they can’t ever trust anyone in the system [Certificate Authority], why not create your own copy of the system that no one else can use? Being able to have perfect security from top to bottom, similar to their recently announced custom chips they put in every one of their servers.
    • David Press: The benefits of SDN are less about latency and uptime and more about flexibility and programmability.
    • Benedict Evans: Web 2.0 was followed not by anything one could call 3.0 but rather a basic platform shift...one can see the rise of machine learning as a fundamental new enabling technology...one can see quite a lot of hardware building blocks for augmented reality glasses...so the things that are emerging at the end of the mobile S-Curve might also be the beginning of the next curve. 
    • @kevinmarks: 20% people have 0 microservices in production - the rest are already running microservices
    • @joeerl: You've got to be joking - should be 1M clients/server at least
    • SikhGamer: We considered using RabbitMQ at work but ultimately opted for SNS and SQS instead. Main reason being that we cared about delivering value and functionality. Over the cost of yet managing another resource. And the problems of reliability become Amazon's problem. Not ours.
    • DataStax: A firewall is the simplest, most effective means to secure a database. Sounds complicated, but it’s so easy a government agent could do it.
    • @danielbryantuk: "If you think good architecture is expensive, try bad architecture" @KevlinHenney #OOP2017
    • Peter Dizikes: The new method [wisdom from crowds] is simple. For a given question, people are asked two things: What they think the right answer is, and what they think popular opinion will be. The variation between the two aggregate responses indicates the correct answer.
    • Philip Ball: Looked at this way, life can be considered as a computation that aims to optimize the storage and use of meaningful information. So living organisms can be regarded as entities that attune to their environment by using information to harvest energy and evade equilibrium.
    • Ed Sutton: The study shows the effectiveness of personality targeting by showing that marketers can attract up to 63% more clicks and up to 1400% more conversions in real-life advertising campaigns on Facebook when matching products and marketing messages to consumers’ personality characteristics.
    • Pete Trbovitch: Today’s mobile app ecosystem most closely resembles the PC shareware era. Apps that are offered free to download can carry an ad-supported income model, paid extended content, or simply bonus features to make the game easier to beat. The bar to entry is as low as it’s ever been 
    • @BenedictEvans: Global mainframe capacity went up 4-5x from 2000-2010. ‘Dead’ technology can have a very long half-life
    • @searls: I keep seeing teams spend months building custom infrastructure that could be done in 20 minutes with Heroku, Github, Travis. Please stop.
    • @mdudas: Starbucks says popularity of its mobile app has created long lines at pickup counters & led to drop in transactions.
    • @cdixon: Software eats networking: Nicira (NSX) will generate $1B revenue for VMWare this year
    • raubitsj: With respect to vibration: we [Google] found vibration caused by adjacent drives in some of our earlier drive chassis could cause off-track writes. This will cause future reads to the data to return uncorrectable read errors. Based on Backblaze's methodology they will likely call out these drives as failed based on SMART or RAID/ReedSolomon sync errors.
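
The crowd-wisdom method in the Peter Dizikes quote above lends itself to a short sketch. The Python below is one reading of it, not the researchers' code: it picks the answer whose actual share of votes most exceeds the share respondents predicted it would get. The function name surprisingly_popular and the sample responses are assumptions made up for illustration.

```python
from collections import Counter


def surprisingly_popular(own_answers, predicted_popular):
    """Return the answer that is more popular than the crowd predicted.

    own_answers: each respondent's own answer to the question.
    predicted_popular: each respondent's guess at the most popular answer.
    """
    if not own_answers or not predicted_popular:
        raise ValueError("both response lists must be non-empty")

    actual = Counter(own_answers)
    predicted = Counter(predicted_popular)
    options = sorted(set(actual) | set(predicted))

    def surprise(option):
        # Gap between the actual vote share and the predicted vote share.
        actual_share = actual[option] / len(own_answers)
        predicted_share = predicted[option] / len(predicted_popular)
        return actual_share - predicted_share

    return max(options, key=surprise)


if __name__ == "__main__":
    # Hypothetical responses: most people answer "yes", but almost everyone
    # expects "yes" to be the popular answer, so "no" does better than predicted.
    own = ["yes"] * 6 + ["no"] * 4
    guesses = ["yes"] * 9 + ["no"] * 1
    print(surprisingly_popular(own, guesses))
```

In this made-up data a simple majority vote would return "yes", while the gap between the two aggregates points to "no".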

  • Well this is different. GitLab live streamed the handling of their GitLab.com Database Incident - 2017/01/31. It wasn't what you would call riveting, but that's an A+++ for transparency. They even took audience questions during the process. What went wrong? The snippets function was DDoSed, which generated a large increase of data to the database, so the slaves were not able to keep up with the replication state. WAL transaction files that were no longer in the production backlog were being requested, so transaction logs were missed. They were starting the copy again from a known good state, then things went sideways. They were lucky to have a 6 hour old backup and that's what they were restoring to. Sh*te happens; how the team handled it and their knowledge of the system should give users confidence going forward.

  • OK, this turned out to be false, but nobody doubted it could be true or where things are going in the future. Hotel ransomed by hackers as guests locked out of rooms.

  • Interesting use of Lambda by AirBnB. StreamAlert: Real-time Data Analysis and Alerting. There's an evolution from compiling software using libraries that must be in the source tree, to running software that requires downloading lots of packages from a repository, to now using services that require a lot of other services to be available in the environment for a complex pipeline to run. StreamAlert doesn't just use Lambda; it also uses Kinesis, SNS, S3, CloudWatch, KMS, and IAM. Each step is both a deeper level of lock-in and an enabler of richer functionality. What does StreamAlert do? A real-time data analysis framework with point-in-time alerting. StreamAlert is unique in that it’s serverless, scalable to TB’s/hour, infrastructure deployment is automated and it’s secure by default. 

Don't miss all that the Internet has to say on Scalability, click below and become eventually consistent with all scalability knowledge (which means this post has many more items to read so please keep on reading)...

Categories: High Scalability

Performance, Scalability, and High Availability: 3 Key Infrastructure Adaptability Requirements

High Scalability - Thu, 2017-02-02 20:31

This is a guest post by Tony Branson

Performance, scalability, and HA are often used interchangeably, and any confusion about them can result in unrealistic metrics and deployment delays. It is important to invest your time and understand the differences among these three approaches before you invest your money in resilient systems.

Performance

Categories: High Scalability

Sponsored Post: InnoGames, Contentful, Stream, Loupe, New York Times, Scalyr, VividCortex, MemSQL, InMemory.Net, Zohocorp

High Scalability - Tue, 2017-01-31 16:56

Who's Hiring?
  • GoCardless is building the payments network for the internet. We’re looking for DevOps Engineers to help scale our infrastructure so that the thousands of businesses using our service across Europe can take payments. You will be part of a small team that sets the direction of the GoCardless core stack. You will think through all the moving pieces and issues that can arise, and collaborate with every other team to drive engineering efforts in the company. Please apply here.

  • InnoGames is looking for Site Reliability Engineers. Do you not only want to play games, but help build them? Join InnoGames in Hamburg, one of the worldwide leading developers and publishers of online games. You are the kind of person who leaves systems in a better state than they were before. You want to hack on our internal tools based on django/python, as well as improving the stability of our 5000+ Debian VMs. Orchestration with Puppet is your passion and you would rather automate stuff than touch it twice. Relational Database Management Systems aren't a black hole for you? Then apply here!

  • Contentful is looking for a JavaScript BackEnd Engineer to join our team in their mission of getting new users - professional developers - started on our platform within the shortest time possible. We are a fun and diverse family of over 100 people from 35 nations with offices in Berlin and San Francisco, backed by top VCs (Benchmark, Trinity, Balderton, Point Nine), growing at an amazing pace. We are working on a content management developer platform that enables web and mobile developers to manage, integrate, and deliver digital content to any kind of device or service that can connect to an API. See job description.

  • The New York Times is looking for a Software Engineer for its Delivery/Site Reliability Engineering team. You will also be a part of a team responsible for building the tools that ensure that the various systems at The New York Times continue to operate in a reliable and efficient manner. Some of the tech we use: Go, Ruby, Bash, AWS, GCP, Terraform, Packer, Docker, Kubernetes, Vault, Consul, Jenkins, Drone. Please send resumes to: technicaljobs@nytimes.com
Fun and Informative Events
  • DBTA Roundtable Webinar: Fast Data: The Key Ingredients to Real-Time Success. Thursday February 23, 2017 | 11:00 AM Pacific Time. Join Stephen Faig, Research Director Unisphere Research and DBTA, as he hosts a roundtable discussion covering new technologies that are coming to the forefront to facilitate real-time analytics, including in-memory platforms, self-service BI tools and all-flash storage arrays. Brian Bulkowski, CTO and Co-Founder of Aerospike, will be speaking along with presenters from Attunity and Hazelcast. Learn more and register.

  • Your event here!
Cool Products and Services
  • Auth0 is the easiest way to add secure authentication to any app/website. With 40+ SDKs for most languages and frameworks (PHP, Java, .NET, Angular, Node, etc), you can integrate social, 2FA, SSO, and passwordless login in minutes. Sign up for a free 22 day trial. No credit card required. Get Started Now.

  • Build, scale and personalize your news feeds and activity streams with getstream.io. Try the API now in this 5 minute interactive tutorial. Stream is free up to 3 million feed updates so it's easy to get started. Client libraries are available for Node, Ruby, Python, PHP, Go, Java and .NET. Stream is currently also hiring Devops and Python/Go developers in Amsterdam. More than 400 companies rely on Stream for their production feed infrastructure; this includes apps with 30 million users. With your help we'd like to add a few zeros to that number. Check out the job opening on AngelList.

  • A note for .NET developers: You know the pain of troubleshooting errors with limited time, limited information, and limited tools. Log management, exception tracking, and monitoring solutions can help, but many of them treat the .NET platform as an afterthought. You should learn about Loupe...Loupe is a .NET logging and monitoring solution made for the .NET platform from day one. It helps you find and fix problems fast by tracking performance metrics, capturing errors in your .NET software, identifying which errors are causing the greatest impact, and pinpointing root causes. Learn more and try it free today.

  • Scalyr is a lightning-fast log management and operational data platform. It's a tool (actually, multiple tools) that your entire team will love. Get visibility into your production issues without juggling multiple tabs and different services -- all of your logs, server metrics and alerts are in your browser and at your fingertips. Loved and used by teams at Codecademy, ReturnPath, Grab, and InsideSales. Learn more today or see why Scalyr is a great alternative to Splunk.

  • InMemory.Net provides a Dot Net native in-memory database for analysing large amounts of data. It runs natively on .Net, and provides native .Net, COM & ODBC APIs for integration. It also has an easy to use language for importing data, and supports standard SQL for querying data. http://InMemory.Net

  • VividCortex is a SaaS database monitoring product that provides the best way for organizations to improve their database performance, efficiency, and uptime. Currently supporting MySQL, PostgreSQL, Redis, MongoDB, and Amazon Aurora database types, it's a secure, cloud-hosted platform that eliminates businesses' most critical visibility gap. VividCortex uses patented algorithms to analyze and surface relevant insights, so users can proactively fix future performance problems before they impact customers.

  • MemSQL provides a distributed in-memory database for high value data. It's designed to handle extreme data ingest and store the data for real-time, streaming and historical analysis using SQL. MemSQL also cost effectively supports both application and ad-hoc queries concurrently across all data. Start a free 30 day trial here: http://www.memsql.com/

  • ManageEngine Applications Manager : Monitor physical, virtual and Cloud Applications.

  • www.site24x7.com : Monitor End User Experience from a global monitoring network. 

If any of these items interest you there's a full description of each sponsor below...

Categories: High Scalability

Part 1 of Thinking Serverless — How New Approaches Address Modern Data Processing Needs 

High Scalability - Mon, 2017-01-30 16:56

This is a guest repost by Ken Fromm, a 3x tech co-founder — Vivid Studios, Loomia, and Iron.io.

First I should mention that of course there are servers involved. I’m just using the term that popularly describes an approach and a set of technologies that abstracts job processing and scheduling from having to manage servers. In a post written for ReadWrite back in 2012 on the future of software and applications, I described “serverless” as the following.

The phrase “serverless” doesn’t mean servers are no longer involved. It simply means that developers no longer have to think that much about them. Computing resources get used as services without having to manage around physical capacities or limits. Service providers increasingly take on the responsibility of managing servers, data stores and other infrastructure resources…Going serverless lets developers shift their focus from the server level to the task level. Serverless solutions let developers focus on what their application or system needs to do by taking away the complexity of the backend infrastructure.

At the time of that post, the term “serverless” was not all that well received, as evidenced by the comments on Hacker News. With the introduction of a number of serverless platforms and a significant groundswell on the wisdom of using microservices and event-driven architectures, that backlash has fortunately subsided.

A Sample Use Case

Since it is useful to have an example in mind as I discuss issues and concerns in developing a serverless app, I will use the example of a serverless pipeline for processing email and detecting spam. It is event-driven in that when an email comes in, it will spawn a series of jobs or functions intended to operate specifically on that email.

In this pipeline, you may have tasks that perform parsing of text, images, links, mail attributes, and other items or embedded objects in the email. Each item or element might have different processing requirements which in turn would entail one or more separate tasks as well as even its own processing pipeline or sequence. An image link, for example, might be analyzed across several different processing vectors to determine the content and veracity of the image. Depending on the message scoring and results (spam or not), various courses of action will then be taken, which would likely, in turn, involve other serverless functions.
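
As a rough illustration of the pipeline described above, the Python sketch below fans one incoming email out to a set of small, independent task functions and combines their scores into a verdict. Everything in it is an assumption made for the example: the Email fields, the task names score_text and score_links, the scoring rules and the threshold. On a real serverless platform each task would be a separately deployed function triggered by the email event; here they are called in-process to keep the example self-contained.

```python
from dataclasses import dataclass, field


@dataclass
class Email:
    sender: str
    body: str
    links: list = field(default_factory=list)


def score_text(email):
    """Task: count words from a toy spam vocabulary in the message body."""
    spam_words = {"winner", "free", "urgent"}
    words = email.body.lower().split()
    return sum(1 for word in words if word.strip(".,!") in spam_words)


def score_links(email):
    """Task: add 2 points for every link that is not served over HTTPS."""
    return sum(2 for link in email.links if not link.startswith("https://"))


# Each entry stands in for one serverless function in the pipeline.
TASKS = [score_text, score_links]


def on_email_received(email, threshold=3):
    """Event handler: run every task on this email and decide spam or not."""
    scores = {task.__name__: task(email) for task in TASKS}
    verdict = "spam" if sum(scores.values()) >= threshold else "ok"
    return verdict, scores


if __name__ == "__main__":
    message = Email(
        sender="promo@example.com",
        body="You are a WINNER! Claim your free prize, urgent.",
        links=["http://example.com/prize"],
    )
    print(on_email_received(message))
```

Because each task only reads the email and returns a score, new checks (image analysis, mail attributes) can be added to the list without touching the others, which is the task-level thinking the post argues for.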

Thinking at the Task Level
Categories: High Scalability

Stuff The Internet Says On Scalability For January 27th, 2017

High Scalability - Fri, 2017-01-27 16:56

Hey, it's HighScalability time:

 

Tired of noisy drones? Use the same dedrone tech used at Davos. It's the future.
If you like this sort of Stuff then please support me on Patreon.
  • 1+ trillion: messages Twitter handles per day; 695 million: Internet users in China; >350k: Twitter Star Wars bots; $90 million: value of LasVegas.com domain name; 45%: WiFi connection failure rate; 80: threads in Slack Mac OS app; 364: slides in Adrian Cockcroft's microservices deck; 5180%: increases at Etsy in daily visits to pages related to Donald Trump; 465,000: cars sold by Costco last year; 14 Million: one day of DuckDuckGo searches; 58 million: science papers online; ~3x: use of Kubernetes in production settings; 54: r3.2xlarge instances used for Reddit caching; $14 billion: Microsoft’s Azure's annual run rate; 

  • Quotable Quotes:
    • Carlo Rovelli: the basic ingredient is down there in the physical world: physical correlation between distinct variables. The physical world is not a set of self-absorbed entities that do their selfish things. It is a tightly knitted net of relative information, where everybody’s state reflects somebody else’s state. 
    • Charles Stross: There’s a saying that goes something like this: “Lieutenants study tactics, colonels study strategy, generals study logistics, and field marshals study economics.” But economists—the smart ones—study education.
    • Kirk Pepperdine: I would suggest that with 200 JVMs running on 80 core you should consider using the serial collector.
    • @alicegoldfuss: Things containers improve: - testing - deploying Things containers shit on: - security - troubleshooting - managing systems resources  Note: this is a long thread of comments, enjoy!
    • @pewinternet: In 2005, just 5% of Americans used at least one social media platform. Today, 69% do. 
    • Manu Saadia: [Asked whether] he [Peter Thiel] was a bigger fan of “Star Wars” or “Star Trek,” Thiel replied that, as a capitalist, he preferred the former. “ ‘Star Trek’ is the communist one,” he said. “The whole plot of ‘Star Wars’ starts with Han Solo having this debt that he owes, and so the plot in ‘Star Wars’ is driven by money.”
    • @asymco: Google's costs-per-click — essentially its pricing — fell 16% y/y
    • Anna MacLachlan: In order to follow best practices for performance when building PWAs [progressive web app] and otherwise, the Chrome team goes by the RAIL performance model: Respond: 100ms / Animate: < 8ms / Idle work in 50ms chunks / Load: 1,000ms to interactive
    • Deepak Singh (AWS): There is a certain scale where specialized hardware and infrastructure make a lot of sense and for those who need special infrastructure, we think FPGAs are one clear way to go
    • @MarcWilczek: Containerization: 19% using it, 15% testing it, 13% considering it; 15% are curious, 38% have no plans or clue. #Cloud #CIO @interop #Docker
    • Clarke Illmatical: The death of net neutrality will severely impact IoT solutions which rely on an open internet concept.
    • @mipearson: OH "I'm the Technical Debt Fairy. If you leave technical debt under your pillowcase at night I hire away your best developers"
    • Reddit:  When you vote, your vote isn’t instantly processed—instead, it’s placed into a queue. Depending on the backlog of the queue, this can mean if you were to vote and quickly refresh the page, your vote may not have been processed yet, and it would appear that your vote had been reverted. 
    • Martin Kleppmann: in a 8,000-node cluster, the chance of permanently losing all three replicas of some piece of data (within the same time period) is about 0.2%. Yes, you read that correctly: the risk of losing all three copies of some data is twice as great as the risk of losing a single node!
    • Tammy Everts: Always remember that if you’re competing online, you’re competing with Amazon.
    • Marco Arment: I'm now spending more on [Apple] search ads than I am servers.
    • dijit: the big issue with databases I've worked with is not how many inserts you do per second, even spinning rust, if properly reasoned can do -serious- inserts per second in append only data structures like myisam, redis even lucene. However the issue comes when you want to read that data or, more horribly, update that data. Updates, by definition are a read and a write to commuted data, this can cause fragmentation and other huge headaches. I'd love to see someone do updates 1,000,000/s
    • @m0biusloop: things kubernetes can't do: ipv6, multiple host networks, prefix based policy, egress policy.
    • Dr Zhou: What is really surprising is our questioning on the whole effort of bot detection in the past years. Suddenly we feel vulnerable and don't know much: how many more are there? What do they want to do?
    • Marianne Bellotti: 15 years ago, everybody was telling us ‘Get off the mainframe, get on AT&T applications, build these thick clients. Mainframes are out.’ And now thick clients are out, and everybody’s moving to APIs and microservices, which basically are very similar to the thin client that a terminal uses to interact with a mainframe.
    • @garybernhardt: Consulting service: you bring your big data problems to me, I say "your data set fits in RAM", you pay me $10,000 for saving you $500,000.
    • @jennschenker: #DLD17: BMW says it will evolve from being a car maker to a mobility services company.
    • Nick Craver (StackOverflow): We try to be boring. Boring is stable ...scalable. The simpler something is, the higher it scales...We are not against anything. We have loyalty to nothing. If there's a better option that comes along, move to it!
    • Romesberg: evolution works by starting with something close, and then changing what it can do in small steps
    • bitwiseand: The CAP theorem states that in the event of a network-partition you have to choose one of C or A. More intuitively, any delay between nodes can be modeled as a temporary network partition and in that event you have but two choices either wait to return the latest data at a peer node (C) or return the last available data at a peer node (A).
    • Gvaireth: We just had a discussion in the team, and we decided, that we need add-one microservice that would get a number and return the number increased by one. A nice separation of concerns in modern distributed web application :)
    • Russ Cox: When I first started thinking about generics for Go in 2008, the main examples to learn from were C#, Java, Haskell, and ML. None of the approaches in those languages seemed like a perfect fit for Go. Today, there are newer attempts to learn from as well, including Dart, Midori, Rust, and Swift.
    • RaptorXP: Do your virtual reality wearables usually connect to deep learning drones on the blockchain?
    • Twitter: Hadoop: We have multiple clusters storing over 500 PB divided in four groups (real time, processing, data warehouse and cold storage). Our biggest cluster is over 10k nodes. We run 150k applications and launch 130M containers per day.
    • arnon: GPUs tend to lend themselves well to analytics, contrary to transactions. Specifically, columnar databases. When the columns are all of the same data type, and the data locality is high, GPUs perform /very/ well.
    • Ed Sim: Despite the amazing productivity gains from open source, AWS, microservices and other new technologies, we have seen the time to launch extending and the cost of getting a minimally viable product (MVP) out the door increasing.
    • Daniel Miessler: It’s [AMP] poisonous to the underlying concept of an open internet. If this were to become widely adopted, you’d search for something, get results, consume the content, and you’d never leave Google.
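
The choice bitwiseand describes in the quote above (wait for the latest data, or return the last data available) can be sketched in a few lines. This is only an illustration under stated assumptions: the Replica class, the read function, and the "C"/"A" mode strings are hypothetical names made up for this example, not any real database API.

```python
from dataclasses import dataclass


class Unavailable(Exception):
    """Raised when a consistent read cannot reach the peer node."""


@dataclass
class Replica:
    # Hypothetical replica state: a value plus a version counter.
    value: str
    version: int


def read(local: Replica, peer: Replica, peer_reachable: bool, mode: str) -> str:
    """Read from `local`, which may lag behind `peer`.

    mode="C": answer only if the peer could be consulted for the latest data.
    mode="A": always answer, possibly with stale local data.
    """
    if peer_reachable:
        # No partition: return whichever copy is newer.
        newest = peer if peer.version > local.version else local
        return newest.value
    if mode == "C":
        # Partitioned, consistency chosen: refuse rather than risk stale data.
        raise Unavailable("peer unreachable; cannot confirm latest value")
    # Partitioned, availability chosen: serve the last data we have.
    return local.value


if __name__ == "__main__":
    local, peer = Replica("old", 1), Replica("new", 2)
    print(read(local, peer, peer_reachable=True, mode="C"))
    print(read(local, peer, peer_reachable=False, mode="A"))
```

A real system would treat a slow peer the same way as an unreachable one, by putting a timeout on the call to the peer. That is the "delay as temporary partition" point made in the quote.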

  • Great detailed discussion on all things serverless. AWS Podcast #171: Serverless Special. Serverless is an implementation detail, not an architectural pattern. If you look at serverless as just a way to run existing code then it’s an implementation detail.  If you take it as an opportunity to think about how your application could be structured then it tends more towards the architectural pattern/microservices conversation; Serverless as a concept is a spectrum not binary. Serverless is an important concept but the boundaries are not clear...

  • Information wants to be free. Sci-Hub is the first pirate website in the world to provide mass and public access to tens of millions of research papers.

Don't miss all that the Internet has to say on Scalability, click below and become eventually consistent with all scalability knowledge (which means this post has many more items to read so please keep on reading)...

Categories: High Scalability

Master-Master Replication and Scaling of an Application between Each of the IoT Devices and the Cloud

High Scalability - Mon, 2017-01-23 16:56

In this article, I want to share with you how I solved a very interesting problem of synchronizing data between IoT devices and a cloud application.

I’ll start by outlining the general idea and the goals of my project. Then I’ll describe my implementation in greater detail. This is going to be a more technically advanced part, where I’ll be talking about the Contiki OS, databases, protocols and the like. In the end, I’ll summarize the technologies I used to implement the whole system.

Project overview

So, let’s talk about the general idea first.

Here’s a diagram illustrating the final state of the whole system:

I have a user who can connect to IoT devices via a cloud service or directly (that is, over Wi-Fi).

Also, I have an application server somewhere in the cloud and the cloud itself somewhere on the Internet. This cloud can be anything: an AWS or Azure instance, a dedicated server, and so on :)

The application server is connected to IoT devices over some protocol. I need this connection to exchange data between the application server and the IoT devices.

The IoT devices are connected to each other in some way (say, over Ethernet or Wi-Fi).

Also, I have more IoT devices generating some telemetry data, like light or temperature readings. There can be more than 100, or even over 1,000, devices.

Basically, my goal was to make it possible to exchange data between the cloud and these IoT devices.

Before I proceed, let me outline some requirements for my system:

Categories: High Scalability

Stuff The Internet Says On Scalability For January 20th, 2017

High Scalability - Fri, 2017-01-20 16:56

Hey, it's HighScalability time:

 

Absolutely. Do we agree that the cerebellum is amazingly beautiful? (@PeppeGanga)
  • 900 GB: data stolen in Cellebrite hack; 99.24%: users identified by cross-browser fingerprinting; 72%: intend to migrate to a hybrid cloud; 90%: Google & Facebook ad traffic is useless; 5.2 terabytes per second: data from Australian Square Kilometre Array Pathfinder; 10 billion: searches on DuckDuckGo in 2016; $330m: Amazon's loss on Alexa; 

  • Quotable Quotes:
    • @brucel: Breaking: Programmer accused of writing unreadable code refuses to comment.
    • @asymco: Remember Android first? App Annie believes Apple’s App Store produced about twice as much revenue as Google Play
    • @bridgetkromhout: Describing your old-timer ranting as "greybeard" just makes me want to fight you with sed & awk at twenty paces. Be there tomorrow at dawn.
    • @StevenShorrock: Root Cause Analysis is: * Acceptable for simple systems * Inappropriate for complicated systems * Ludicrous for complex systems
    • @swardley: Five years ago Amazon was worth about half of Walmart, today Walmart is worth about half of Amazon.
    • Eric Raymond: In practice, I found Rust painful to the point of unusability. The learning curve was far worse than I expected; it took me those four days of struggling with inadequate documentation to write 67 lines of wrapper code for the server.
    • @swardley: past history shows many major players won't announce they're getting into the battle until some time after war has ended
    • @benthompson: Apple wasn't billed as phone maker / Amazon wasn't billed as infrastructure provider / FB wasn't billed as portal / Snapchat wasn't billed as TV
    • Jessitron: the biggest consideration in choosing whether to use libraries or services for distribution of effort / modularization is that choice of who decides when it deploys. Who controls which code is in production at a given time.
    • Hi Ben: The disruption of TV will follow a similar path: a different category will provide better live sports, better story-telling, or better escapism. Said category will steal attention, and when TV no longer commands enough attention of enough people, the entire edifice will collapse. Suddenly.
    • @leonidasfromxiv: I also don't understand why people compare Go with Rust. If you need a GC-less programming language: Rust; if you need a board game: Go.
    • Carlo Rovelli: The world isn’t just a mass of colliding atoms; it is also a web of correlations between sets of atoms, a network of reciprocal physical information between physical systems.
    • Chris Dixon: In the beginning, hardware-focused companies make gadgets with ever increasing laundry lists of features. Then a company with strong software expertise (often a new market entrant) comes along that replaces these feature-packed gadgets with full-fledged computers. 
    • Animats: The real question is "what do we do with a lot of CPUs without shared memory?" Such hardware has been built many times - Thinking Machines, Ncube, the PS2's Cell - and has not been too useful for general purpose computing.
    • @taavet: Very unfortunate that incumbents see tech only as a way to cut costs. Versus seeing tech to offer much better products.
    • NelsonMinar: This is what security looks like when your threat model is well funded government agencies.
    • Don Norman: The solution requires a different approach to the design of automation: collaboration. Instead of automating what can be automated, leaving the rest to the driver, we must develop collaborative systems so that the driver is continually involved in giving high-level guidance, thereby always staying active, always being in the loop. 
    • Thomas Frey: It took 50 years for the world to install the first million industrial robots. The next million will take only eight. Will this cause more jobs or fewer jobs in the future? I'm not convinced we know the answer.
    • @jtauber: "Every shot in Piper is composed of millions of grains of sand, each one of them around 5000 polygons."
    • rackforms: my point is the current situation, basically 2 companies controlling so much traffic, seems, well, bad for small business in this country. I value what they bring to the table and fully understand why they're so popular. But if things keep on this way where does that lead the guys like me? Is this just the way it has to be? Is the dream of the open Internet already dead?
    • @sheeshee: I think I know why it's called "DevOps" - "DevOops" was too obvious... ;)
    • greenspot: The open solution to a faster mobile web would have been so easy: Just penalize large and slow web pages without defining a dedicated mobile specification. That's it. This wasn't done in the past, slow pages outperformed fast ones on the SERPs because of some weird Google voodoo ranking, heck sometimes even desktop sites outperformed responsive ones on smartphones. If they had just tweaked these odd ranking rules in a way that speed and size got more impact on the overall ranking there wouldn't have been any reason for AMP; the market would have regulated itself.
    • Juergen Schmidhuber: General purpose quantum computation won’t work (my prediction of 15 years ago is still standing). Related: The universe is deterministic, and the most efficient program that computes its entire history is short and fast, which means there is little room for true randomness, which is very expensive to compute. What looks random must be pseudorandom, like the decimal expansion of Pi, which is computable by a short program. Many physicists disagree, but Einstein was right: no dice. There is no physical evidence to the contrary

  • RethinkDB is shutting down and here's the post-mortem. Lessons: the database market is like Mad Max fighting in the Thunderdome; it's better to optimize for useless microbenchmarks than it is to be good; optimism isn't a strategy.

  • Apple isn't alone in using custom hardware to thwart nation state level attackers. Google Infrastructure Security Design Overview. Good overview at Google reveals its servers all contain custom security silicon. Google designs "custom chips, including a hardware security chip that is currently being deployed on both servers and peripherals. These chips allow us to securely identify and authenticate legitimate Google devices at the hardware level." Google encrypts data before it is written to disk, to make it harder for malicious disk firmware to access data. Google uses automated and manual code review techniques to detect bugs in software its developers write. Google scans user-installed apps, downloads, browser extensions, and content browsed from the web for suitability on corp clients. Google uses a custom version of the KVM hypervisor. Good discussion on HackerNews, where a lot of the comments are on how Google needs this level of sophistication to evade the prying eyes of governments.

  • What happens when you embed machine learning into a DBMS in order to continuously optimise its runtime performance? You get Self-driving database management systems. Humans suck at tuning databases so this is just one more job AIs will eventually toss into the dust bin of history. TensorFlow was integrated inside Peloton, training two RNNs on 52 million queries from one month of traffic for a popular site. Does it help? Early results are promising: (1) RNNs accurately predict the expected arrival rate of queries, (2) hardware-accelerated training has a minor impact on the DBMS’s CPU and memory resources, and (3) the system deploys actions without slowing down the application.

Don't miss all that the Internet has to say on Scalability, click below and become eventually consistent with all scalability knowledge (which means this post has many more items to read so please keep on reading)...

Categories: High Scalability

Sponsored Post: Contentful, Stream, Loupe, New York Times, Scalyr, VividCortex, MemSQL, InMemory.Net, Zohocorp

High Scalability - Tue, 2017-01-17 18:05

Who's Hiring?
  • Contentful is looking for a JavaScript BackEnd Engineer to join our team in their mission of getting new users - professional developers - started on our platform within the shortest time possible. We are a fun and diverse family of over 100 people from 35 nations with offices in Berlin and San Francisco, backed by top VCs (Benchmark, Trinity, Balderton, Point Nine), growing at an amazing pace. We are working on a content management developer platform that enables web and mobile developers to manage, integrate, and deliver digital content to any kind of device or service that can connect to an API. See job description.

  • The New York Times is looking for a Software Engineer for its Delivery/Site Reliability Engineering team. You will also be a part of a team responsible for building the tools that ensure that the various systems at The New York Times continue to operate in a reliable and efficient manner. Some of the tech we use: Go, Ruby, Bash, AWS, GCP, Terraform, Packer, Docker, Kubernetes, Vault, Consul, Jenkins, Drone. Please send resumes to: technicaljobs@nytimes.com
Fun and Informative Events
  • Your event here!
Cool Products and Services
  • Build, scale and personalize your news feeds and activity streams with getstream.io. Try the API now in this 5 minute interactive tutorial. Stream is free up to 3 million feed updates so it's easy to get started. Client libraries are available for Node, Ruby, Python, PHP, Go, Java and .NET. Stream is currently also hiring Devops and Python/Go developers in Amsterdam. More than 400 companies rely on Stream for their production feed infrastructure, this includes apps with 30 million users. With your help we'd like to add a few zeros to that number. Check out the job opening on AngelList.

  • A note for .NET developers: You know the pain of troubleshooting errors with limited time, limited information, and limited tools. Log management, exception tracking, and monitoring solutions can help, but many of them treat the .NET platform as an afterthought. You should learn about Loupe...Loupe is a .NET logging and monitoring solution made for the .NET platform from day one. It helps you find and fix problems fast by tracking performance metrics, capturing errors in your .NET software, identifying which errors are causing the greatest impact, and pinpointing root causes. Learn more and try it free today.

  • Scalyr is a lightning-fast log management and operational data platform. It's a tool (actually, multiple tools) that your entire team will love. Get visibility into your production issues without juggling multiple tabs and different services -- all of your logs, server metrics and alerts are in your browser and at your fingertips. Loved and used by teams at Codecademy, ReturnPath, Grab, and InsideSales. Learn more today or see why Scalyr is a great alternative to Splunk.

  • InMemory.Net provides a Dot Net native in memory database for analysing large amounts of data. It runs natively on .Net, and provides native .Net, COM & ODBC APIs for integration. It also has an easy to use language for importing data, and supports standard SQL for querying data. http://InMemory.Net

  • VividCortex is a SaaS database monitoring product that provides the best way for organizations to improve their database performance, efficiency, and uptime. Currently supporting MySQL, PostgreSQL, Redis, MongoDB, and Amazon Aurora database types, it's a secure, cloud-hosted platform that eliminates businesses' most critical visibility gap. VividCortex uses patented algorithms to analyze and surface relevant insights, so users can proactively fix future performance problems before they impact customers.

  • MemSQL provides a distributed in-memory database for high value data. It's designed to handle extreme data ingest and store the data for real-time, streaming and historical analysis using SQL. MemSQL also cost effectively supports both application and ad-hoc queries concurrently across all data. Start a free 30 day trial here: http://www.memsql.com/

  • ManageEngine Applications Manager : Monitor physical, virtual and Cloud Applications.

  • www.site24x7.com : Monitor End User Experience from a global monitoring network. 

If any of these items interest you there's a full description of each sponsor below...

Categories: High Scalability

Wouldn't it be nice if everyone knew a little queuing theory?

High Scalability - Tue, 2017-01-17 16:56

After many days of rain one lane of this two lane road collapsed into the canyon. It's been out for a month and it will be many more months before it will be fixed. Thanks to Google maps way too many drivers take this once sleepy local road. 

How do you think drivers go through this chokepoint? 

 

 

One hundred experience points to you if you answered one at a time.

One at a time! Going through a half-duplex pipe following a first in, first out discipline takes forever!

Yes, there is a stop sign. And people default to this mode because it appeals to our innate sense of fairness. What could be fairer than alternating one at a time?

The problem is it's stupid.

While waiting, stewing, growing angrier, I often think if people just knew a little queueing theory we could all be on our way a lot faster.

We can't make the pipe full duplex, so that's out. Let's assume there's no priority involved, vehicles are roughly the same size and take roughly the same time to transit the network. Then what do you do?

Why can't people figure out it's faster to drive through in batches? If we went in groups of, say, three, the throughput would be much higher. And when one side's queue depth grows larger because people are driving to or from work, that side's batch size should increase. 
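
Here's a rough back-of-the-envelope sketch of why batches win. The model and the numbers are assumptions for illustration only, not measurements from this road: the first car of each turn needs clear_time_s seconds to get through the one-lane section, each following car in the same batch exits headway_s seconds behind the car ahead, and the other side can't start until the last car is out.

```python
def throughput_per_minute(batch_size: int,
                          clear_time_s: float = 10.0,
                          headway_s: float = 2.0) -> float:
    """Cars per minute through a one-lane section that alternates direction.

    Assumed model (hypothetical numbers): a turn of `batch_size` cars takes
    clear_time_s for the first car plus headway_s for each following car.
    """
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    turn_time_s = clear_time_s + (batch_size - 1) * headway_s
    return 60.0 * batch_size / turn_time_s


if __name__ == "__main__":
    for k in (1, 2, 3, 5, 10):
        print(f"batch of {k:2d}: {throughput_per_minute(k):5.1f} cars/minute")
```

With these made-up numbers, going one at a time moves 6 cars a minute and going in threes moves about 12.9. The fixed cost of clearing the lane gets paid once per batch instead of once per car, which is the whole trick.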

Since this condition will last a long time we have a possibility to learn because the same people take this road all the time. So what happens if you try to change the culture by showing people what a batch is by driving right behind someone as they take their turn?

You got it. Honking. There's a simple heuristic, a deeply held ethic against line cutting, so people honk, flip you off, and generally make their displeasure known.

It's your classic battle of reason versus norms. The smart thing is the thing we can't do by our very natures. So we all just keep doing the dumb thing.

 

Categories: High Scalability

Stuff The Internet Says On Scalability For January 13th, 2017

High Scalability - Fri, 2017-01-13 16:56

Hey, it's HighScalability time:

 

So you think you're early to market! The Man Who Invented VR Goggles 50 Years Too Soon
  • 99.9: Percent PCs cheaper than in 1980; 300x20 miles: California megaflood; 7.5 million: articles published on Medium; 1 million: Amazon paid eBook downloads per day; 121: pages on P vs. NP; 79%: Americans use Facebook; 1,600: SpaceX satellites to fund a city on Mars; 

  • Quotable Quotes:
    • @GossiTheDog: How corporate security works: A) buy a firewall B) add a rule allowing all traffic C) the end
    • @caitie: Distributed Systems PSA: your regular reminder that the operational cost of a system should be included & considered when designing a system
    • @jimpjorps: 1998: the internet means you can "telecommute" to a tech job from anywhere on Earth 2017: everyone works in the same one square mile of SF
    • Jessi Hempel: [re: BitTorrent] Perhaps the lesson here is that sometimes technologies are not products. And they’re not companies. They’re just damn good technologies.
    • giltene: My new pet peeve: "how to make X faster: do less of X" recommendations.
    • peterwwillis: It used to be you had to actually break into a system to exfiltrate all its data. Now you just make an HTTP query.
    • Laralyn McWilliams: Identify problems but focus on solutions. If you become more about problems than solutions, that negativity infects your work, your team, and how you think about your career.
    • Chris Fox: Apple is 100% a boutique retailer, meaning that a human chooses which books to promote. Without that, there was no organic discovery tool where readers could find your book.
    • vytah: In fact, the 1986 [Chernobyl] disaster happened because the engineers decided to get rid of safeguards and run tests.
    • Eric Elliott: Breaking into a user’s top 5 apps is like getting struck by lightning or winning the lottery. Don’t bank on it.
    • Peter: I say the super-intelligent aliens will be powered by hyper-computation, a technology that makes our concept of computation look like counting on your fingers; and they’ll have not only qualia, but hyper-qualia, experiential phenomenologica whose awesomeness we cannot even speak of.
    • SEJeff: LVS is pretty much the undisputed king for serious business load balancing. I've heard (anecdotally) that Uber uses gorb[1] and google has released seesaw, which are both fancy wrappers on top of LVS for load balancing.
    • k__: I have the feeling this is haunting my life. Jobs, relationships, everything. When I got something, it didn't feel that hard to get it. When I try to get something it feels impossible.
    • Nelson Elhage: One of my favorite concepts when thinking about instrumenting a system to understand its overall performance and capacity is what I call “time utilization”. By this I mean: If you look at the behavior of a thread over some window of time, what fraction of its time is spent in each “kind” of work that it does?
    • Bart Sano (Google): I can say that we are committed to the choice of these different architectures, including X86 – and that includes AMD – as well as Power and ARM. The principle that we are investing in heavily is that competition breeds innovation.
    • aaron-lebo: This is a larger issue with developer burnout I suspect. You master one thing and there's someone standing on the corner saying..."well, actually, I've got something better" and there's a very real anxiety in that evaluation process. Does object-oriented programming suck? Are functional languages the future? Do you really want an SPA? Should you replace your C codebase with Rust... or Go? Is Bitcoin worth getting in on? etc etc
    • StorageMojo: [re: Violin’s bankruptcy] The race is not always to the swift, nor riches to the wise. By starting with software, other companies built an early lead, and now have the money and time to optimize hardware for flash.
    • nocarrier: [Why no datacenters in India?] Cost was a smaller factor than politics; the Indian government wanted the private keys for our certs in order to let FB put a POP there. That was an absolute dealbreaker, so we served India from Singapore and other POPs in nearby countries.
    • RDX: So that original post, although long and full of real examples, was not about Javascript fatigue really. It's change fatigue. Let’s be clear, if you’re picking something new, you’re making a conscious choice to grow up with it.
    • @jamesurquhart: Amazing that emergent tech that’ll revolutionize software dev is already almost a commodity utility service. #streaming #serverless #events

  • The Ethics of Autonomous Cars. The obvious revenue model is highest bidder lives. During the first few milliseconds of a crash response a real-time bidding session is created and the lowest bidder assumes the risk. That at least captures the zeitgeist of the times.

  • First Go. Now poker. DeepStack: Expert-Level Artificial Intelligence in No-Limit Poker. Thank the force humans are still unbeatable at Sabacc. 

  • Medium may be the first YA (Young Adult, think Hunger Games) style publishing outlet. YA is often written in first-person present. It's a good way to fake authenticity. Traditional publications use third-person past tense, but that's not what works best on Medium. What I learned from analyzing the top 252 Medium stories of 2016: The words “you” and “I” were by far the most common, which suggests that addressing the reader directly as an individual person is a better writing strategy than writing in third person.

  • Ben Kehoe says AWS Step Functions is not the cheap, high-scale state machines using an event-driven paradigm he has been looking for. FaaS is stateless, and AWS Step Functions provides state as-a-Service: at $0.025 per 1,000 executions, it’s 125 times more expensive per invocation than Lambda; it’s not going to be cost-effective to replace existing roll-your-own Lambda solutions; the default throttling limit for a state machine is two executions per second...it’s not built to handle massively scaled but transient event scheduling.
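
The 125x figure above is easy to check. A minimal sketch, assuming Lambda's request price is $0.20 per million requests (that price isn't stated in the item above, so treat it as an assumption) and ignoring Lambda's separate duration charges; the constant and function names are made up for this example.

```python
from fractions import Fraction

# Price quoted in the item above.
STEP_FUNCTIONS_USD_PER_1000 = Fraction("0.025")
# Assumption: Lambda request pricing, not stated in the item above.
LAMBDA_USD_PER_MILLION_REQUESTS = Fraction("0.20")


def price_ratio() -> Fraction:
    """How many times more one Step Functions execution costs than one Lambda request."""
    lambda_usd_per_1000 = LAMBDA_USD_PER_MILLION_REQUESTS / 1000
    return STEP_FUNCTIONS_USD_PER_1000 / lambda_usd_per_1000


def monthly_cost_usd(events_per_month: int, usd_per_1000: Fraction) -> float:
    """Cost of a month of events at a given per-1,000 price."""
    return float(usd_per_1000 * events_per_month / 1000)


if __name__ == "__main__":
    print("ratio:", price_ratio())
    print("100M events on Step Functions: $",
          monthly_cost_usd(100_000_000, STEP_FUNCTIONS_USD_PER_1000))
    print("100M Lambda requests:          $",
          monthly_cost_usd(100_000_000, LAMBDA_USD_PER_MILLION_REQUESTS / 1000))
```

Fraction is used instead of floats so the ratio comes out as exactly 125 rather than something like 125.00000000000001.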

  • Ransomware has shifted to being a reproducible strategy. @SteveD3: Since I first covered the MongoDB hacking on Jan 3, the number of compromised DBs has surpassed 32,000. Now possibly Elasticsearch. Anything you can find basically with Shodan. Which is why we now have @GossiTheDog: Found out today firms have started doing legal contracts which specifically rule out liability if they get hit by ransomware, naming it.

Don't miss all that the Internet has to say on Scalability, click below and become eventually consistent with all scalability knowledge (which means this post has many more items to read so please keep on reading)...

Categories: High Scalability

Stuff The Internet Says On Scalability For January 6th, 2017

High Scalability - Fri, 2017-01-06 16:56

Hey, it's HighScalability time:

 

Hot rods in space. The Smith Cloud plummets towards our galaxy at nearly 700,000 mph. Vroom!
  • 3 of top 5: Stackoverflow questions are about Git; 3,000: four-passenger cars could serve 98 percent of NYC taxi demand; 44%: US population lives within 20 miles of Amazon fulfillment center; 72%: Amazon customers shopped using mobile device; 110%: increase in industrial control system attacks; 455: Number of scripted television series aired this year; $28.5 billion/yr: App downloads on iOS;

  • Quotable Quotes:
    • @ValaAfshar: Number of robots working in Amazon warehouses: 2016: 45,000 / 2015: 30,000 / 2014: 15,000 / 2013: 1,000 (via @JonErlichman)
    • @jason_kint: updated duopoly #s. new IAB data came out yesterday. easy to run vs earnings for goog and fb, it's evident everyone else is zero sum game. 
    • rb2k_: I also haven't seen one [company in Germany] that isn't riddled with MBA grads that mainly push Jira tickets around.
    • Joe McCann: The best software developers I know are always hacking over the holidays. True story.
    • @kaffeecoder: Sigh. Async vs blocking protocol is irrelevant. What matters is communicating with other services outside your own req/response cycle.
    • Eric Jang: It's not a coincidence that Nvidia, the literal arms-dealer of deep learning, has had a good year in the stock market.
    • @markimbriaco: Just read a comment that said "Any good codebase has every part perfectly isolated". Oh, to be young and optimistic about software again.
    • @swardley: Asked "What do I think is the biggest impact AI would have?" ... hmmm, the largest erosion of social mobility in human history?
    • The Attention Merchants: It is therefore more effective for the State to intervene before options are seen to exist. This creates less friction with the State but requires a larger effort: total attention control.
    • StorageMojo: The cloud’s collateral damage to the legacy IT vendors continues to spread. A few billion here and a few billion there, and pretty soon you’re talking real money.
    • Janakiram MSV: The key takeaway is that Amazon wants enterprises to consume EC2 while it is pushing startups and developers towards Lambda. This move from Amazon will fuel the growth of serverless computing in the industry. 
    • Maxime Chevalier-Boisvert: Edsger Dijkstra famously said, “The question of whether machines can think is about as relevant as the question of whether submarines can swim.”
    • @karlseguin: Microservices without asynchronous messaging (queues) is actually a monolith with really slow and error prone method invocation.
    • AshleysBrain: We've been using WebRTC Datachannels for multiplayer gaming in the browser in our game editor Construct 2 (www.scirra.com) for a couple of years now. Generally they work great! However the main problem we have is switching tab suspends the game, which if you're acting as the host, freezes the game for everybody. This is really inconvenient. 
    • @lstoll: 2017: Year of the return of three tier architecture.
    • @tealtan: “I will never make a racial profiling database!” *continues working on social networks, analytics, ad tech*
    • @abt_programming: Inverse bus factor: “how many developers have to be hit by a bus before a project starts to proceed smoothly?” - @gasproni
    • M.G. Siegler: The numbers speak for themselves. 2 billion words written on Medium in the last year. 7.5 million posts during that time. 60 million monthly readers now. Pageviews galore. So step 2 is simply to slap some banner ads on the site, while step 3 is to profit, right?
    • snarf21: Writing software is hard but to me the hardest part is always taking a random abstract concept from someone's mind (or worse, several people) and converting that into something "real" in a fixed timeline and budget. There will have to be lots of tradeoffs and miscues by definition. We are always making something that doesn't already exist, it is creation and creation is hard.
    • @Pinboard: Who could have foreseen the always-on home microphone might be of interest to the cops?
    • @ThePracticalDev: I heard a rumor that Santa moved over to AWS this year. Big if true.
    • Drew Purves~ “intelligence” extends beyond brains; something as simple as self-replicating RNA exhibits intelligent behavior at the evolutionary scale. The natural world is fractal, cyclic, and fuzzy...in a biosphere, every organism is a resource to another organism. That is, learning and adaptation of each organism is not independent of other organisms.
    • @meatcomputer: System clocks are always accurate and increase monotonically. Timestamps from remote machines are reliable
    • pjmlp: Everything on web development feels like a hack.
    • @kelseyhightower: In my opinion Serverless does not mean FaaS. I consider any platform that hides the management of servers from the user to be Serverless.
    • Amit: one of the lessons I learned from this journey was that the tutorials work best when I've needed that technology for a real project
    • @Carnage4Life: Snapchat's copied all the worst parts of Apple's culture & seen success. More copycats to come
    • ch: So all that's missing with the decentralized web is a centralized service to aggregate the decentralized streams?
    • @mathiasverraes: "Separation of intent and implementation" is probably a much more useful programming principle than all of SOLID combined.
    • doh: We moved back and forth between AWS and GCE (based on who gave us free credits). Once we ran out, we chose GCE and never regretted it. GCE has many quirks, for instance the inconsistency between API and the UI, it misses the richness of the services offered by AWS but everything GCE does offer is just faster, more stable and much more consistent.
    • Exponential Laws: We have argued that exponential growth would not have succeeded without sustained exponential growth at three levels of the computing ecosystem—chip, system, and adopting community. Growth (progress) feeds on itself up to the inflection point.

  • Measuring a gnat's eyebrow at a billion miles. Ivan Linscott tells the thrilling story behind the development of the New Horizons probe to Pluto. a16z Podcast: New Year, New Horizons — Pluto! Completing the probe was a close thing. Finding enough plutonium to power the system almost didn't happen. Enough wasn't found, so the probe had a much lower power budget than originally spec'ed, which caused the communication system to use one FPGA instead of two. You have to use radiation-hardened parts. The chips sit right next to a pile of plutonium pumping out gamma rays and neutrons. The FPGAs had a capacity of a million gates, were hardened by design, and had triple redundancy. Each gate in the array is implemented in threes. They are voted in pairs. If three agree then fine. If two agree, that's the value used. They fit all the code with a margin of 5 gates. They also had a hero's journey sourcing a high-precision oscillator. And then there's the frightening story of when the watchdog timer timed out and put the probe in safe mode. It turned out the JPEG compression algorithm took too long to compress an image of Pluto, and that caused the timeout to fire. The reason is one of those crazy testing stories. When this feature was tested, the picture of the sky was darker so it took less time to compress!

  • The impulse for folks at Twitter to delay Trump's tweets and insider trade on that information must be overwhelming.

  • 33C3 (Chaos Computer Congress) videos are now available. Great overview by Chris Hager. Lots of interesting talks. You might like: Dissecting modern (3G/4G) cellular modems; Edible Soft Robotics - An exploration of candy as an engineered material; Software Defined Emissions - A hacker’s review of Dieselgate; Rebel Cities - Towards A Global Network Of Neighbourhoods And Cities Rejecting Surveillance.

  • A compelling breakdown of the DNC phishing attack. Making everything viewable through a generic UI and everything programmable through a scriptable API has interesting consequences. @pwnallthethings: Could have hacked? Sure. Did hack? No. Let me go through why not...The hackers weren't hacking one-by-one; so URL contraction wasn't done manually. It was done via the Bitly API...Why did the hackers include this info? Same reason they contracted links via API. Because they're not hacking 1-by-1. Are hacking at scale...When hackers hack at scale, they reuse infrastructure. They make mistakes. This isn't unusual. You can piece the bits together.

  • In the game of data you want to be at the top of the data gravity well. When you are down the well, nothing escapes without great cost. AWS Snowball
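
The triple redundancy in the New Horizons item above (each gate implemented three times, with the value that at least two copies agree on being used) can be mimicked in software. What follows is only an illustrative Python sketch of 2-of-3 majority voting on integers, not the probe's actual FPGA logic; the function name majority_vote and the sample bit patterns are assumptions invented for the example.

```python
def majority_vote(a: int, b: int, c: int) -> int:
    """Bitwise 2-of-3 majority of three redundant copies (hypothetical helper).

    Each output bit is 1 only if that bit is 1 in at least two inputs,
    so a single corrupted copy is outvoted by the other two.
    """
    return (a & b) | (a & c) | (b & c)


if __name__ == "__main__":
    good = 0b10110010                 # value all three copies should hold
    flipped = good ^ 0b00001000       # one copy with a single bit upset
    print(bin(majority_vote(good, good, flipped)))  # prints 0b10110010
```

If all three copies agree, the vote simply returns that value; if one copy is upset, the two matching copies win. Two simultaneous upsets of the same bit would defeat this scheme.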

Don't miss all that the Internet has to say on Scalability. Click below and become eventually consistent with all scalability knowledge (which means this post has many more items to read, so please keep on reading)...

Categories: High Scalability