Exploring Steem Scalability

in #steem, 7 years ago (edited)

<p dir="auto">In this post, we will address some of the concerns that have been raised regarding the increasing RAM usage of steemd nodes, as well as our future scaling plans. While we will never take the challenges associated with scaling lightly, we also think that many of these concerns stem from misunderstandings about how to operate steemd nodes properly and efficiently. We will provide some guidance on this in the sections below, and we will also talk about several changes that we have in the pipeline for addressing our projected growth. <p dir="auto"><h1>What is Scalability?</h1> <p dir="auto">The Steem community is rapidly growing, and with it, so is the Steem blockchain. Growth is great, but it brings with it scaling challenges. Other projects (such as Bitcoin and Ethereum) have been stuck at a standstill with their scaling problems for years - unable to adopt any significant changes to meet the growing demands that increased usage has placed on their blockchains. Steem, on the other hand, has continued to evolve rapidly and is meeting these challenges head on, thereby enabling it to process more transactions than every other blockchain <em>combined</em>. In other words, the majority of blockchain transactions occurring globally are being done on Steem. <p dir="auto">We’ve been able to do this because our team is made up of an ever-growing roster of the most talented and innovative blockchain engineers on the planet. This doesn’t make us cocky; it makes us acutely aware of the scaling challenges in front of us, and we want to assure you that we are adequately prepared to deal with them. While we are confident in our strategy, we are also eager to hear your thoughts, objections, and insights in the comments. <p dir="auto"><center><h1>A Brief History of Scaling</h1></center> <p dir="auto">The most critical decision with respect to scaling is where you start. The more scalable the foundation, the more scalable the stack. 
A stack’s ability to scale tends to have, at best, an exponential relationship to the starting point. It is incredibly rare for an architecture to go from being able to support 3,000 people to 3,000,000 people overnight. Instead, it goes from 3 to 6 to 12, etc. Starting from an architecture that was already far ahead of the pack in terms of scalability (<a href="http://docs.bitshares.org/" target="_blank" rel="nofollow noreferrer noopener" title="This link will take you away from hive.blog" class="external_link">Graphene</a>) was a critical component of the scaling strategy. Those that failed to make similar decisions now find themselves in the difficult position of having to rebuild their foundation without damaging the entire ecosystem that was built on top of it. <p dir="auto"><h2>ChainBase and AppBase</h2> <p dir="auto">The first major scalability-related upgrade was the replacement of Graphene with <a href="https://steemit.com/steem/@steemitblog/announcing-steem-0-14-4-shared-db-preview-release" target="_blank" rel="nofollow noreferrer noopener" title="This link will take you away from hive.blog" class="external_link">ChainBase</a>. Thanks to its faster load and exit times, and increased robustness against crashes, ChainBase was critical to enabling Steem to process its current volume of transactions. 
<p dir="auto"><span>The next major improvement that is nearing completion (thanks to the hard work of <a href="/@vandeberg">@vandeberg</a> and the blockchain team) is </span><a href="https://steemit.com/steem/@steemitdev/appbase-the-next-step-forward-for-the-steem-blockchain-let-the-testing-begin" target="_blank" rel="nofollow noreferrer noopener" title="This link will take you away from hive.blog" class="external_link">AppBase</a>, which further improves Steem’s overall scalability through modularization. AppBase will allow many components of the Steem blockchain to run independently, which will permit steemd to take better advantage of the multithreaded nature of computers, and even enable different components of the blockchain to be run on different servers - reducing the need to run the Steem blockchain on individual “high powered, high cost” servers. <p dir="auto"><h1>Optimizing Steemd Nodes: Block Log + State File</h1> <p dir="auto">To operate a steemd node today, it is critical to understand that Steem requires two data stores: the block log and the state file. The block log is the blockchain itself, written to disk. <em>It is accessed infrequently, but is critical to verifying the integrity of new blocks and reindexing the state file if needed.</em> <p dir="auto">The state file contains the current state of Steem objects, such as account balances, posts, and votes. It is backed by disk, but accessed via a technique called memory-mapped files. This technique was introduced in December 2016 with the release of <a href="https://steemit.com/steem/@steemitblog/announcing-steem-0-14-4-shared-db-preview-release" target="_blank" rel="nofollow noreferrer noopener" title="This link will take you away from hive.blog" class="external_link">ChainBase</a>. <p dir="auto"><h2>Everything RAM?</h2> <p dir="auto">Many node operators are suggesting that servers should have enough RAM to hold the entire Steem state file, because Steem's performance drops when the operating system begins “paging” Steem's memory (a common memory management technique). We want to be very clear that it is not required to run a steemd node in this way. This is certainly a valid technique for increasing the performance of reindexing the node and servicing API calls, but is only useful in a limited number of cases. 
In the majority of cases (including with witness, seed, and exchange nodes), it is sufficient to store the shared memory file on a fast SSD or NVMe drive, instead of in RAM. <p dir="auto"><h1>Witness and Seed Node RAM Requirements</h1> <p dir="auto">When running a steemd node with only the <code>witness</code> plugin enabled (the common configuration for witness and seed nodes), Steemit recommends 16 GB of RAM, although 8 GB is likely sufficient if your node does not need to reindex often. If the shared memory file is stored in <code>/dev/shm/</code>, then additional RAM would be needed to hold the entire state file, but this is not a recommended configuration. To avoid the need for extra RAM, the shared memory file can be stored directly on a fast SSD or NVMe drive. <p dir="auto">A server with 8-16 GB of RAM will be slow to reindex, but it will function properly as a seed/witness node once it is up to date with the latest block. Running on a 32 GB server is ideal for optimal replay times, but it is not a requirement for a witness/seed node to operate properly. <p dir="auto"><h1>Shared Memory File Size</h1> <p dir="auto">The default configuration for a steemd node stores the shared memory file in the <code>data/blockchain</code> directory. As long as this location is on a fast enough (SSD or NVMe) drive with sufficient space, the default setting should work. <p dir="auto">The current recommendation is to have at least 150 GB of fast SSD storage, which includes the <code>block_log</code> (currently around 90 GB) and <code>shared_memory.bin</code> (currently around 33 GB). These amounts will increase over time. <p dir="auto">Until now, whenever the shared memory file has grown beyond the size configured in the <code>config.ini</code> file, it has been necessary to update the configuration to a larger size and restart the node. There will be a change included in the next release (Steem 19.4) that will automatically increase this limit as needed, without the need to restart the node. 
This behavior will be configurable, and it can be turned off entirely if you want to keep your state file in <code>/dev/shm</code>. <p dir="auto"><h1>“Full Node” Requirements</h1> <p dir="auto">Nodes that are running additional API plugins (especially account history) will require more RAM to support a larger state file. A “full node” (one that is running all of the plugins) can technically run on a 64 GB server, but it will be extremely slow to reindex, and it will be slow at serving API calls because the operating system's paging algorithm does not handle memory-mapped files very well. A node with 64-256 GB RAM and a fast SSD/NVMe drive may be adequate for many use cases, depending on the load. <p dir="auto"><h2>Increasing Performance on High Use Nodes</h2> <p dir="auto">For more heavily used nodes, the best way (currently) to increase performance is to have enough RAM to hold the entire database. This skips the need for paging altogether, which technically defeats the purpose of having a memory-mapped file. For a node running all of the plugins <strong>except</strong> account history, this currently requires 256 GB RAM on a pre-AppBase node. <p dir="auto">A technique that we have been using to lower the memory requirements of a “full node” (one with everything <strong>including</strong> account history) is to split the API node into two servers. One server runs only “account history,” and the other server runs everything else. This allows both servers to use less than 256 GB RAM, instead of running everything on a 512 GB RAM server. We strongly recommend running account history on a dedicated node if you want a complete history for all accounts, since it eliminates the need to have a single 512 GB RAM server. <p dir="auto">Optimizing the use case of a “full node” is a top priority of ours, and one that we will talk about more in the next section. 
If you only need history for certain accounts, though, or only care about certain operations, the hardware requirements may already be significantly lower. <p dir="auto"><h1>Future Scaling Plans</h1> <p dir="auto">We are currently working on several projects that will reduce the memory requirements of “full nodes” by moving much of the API logic into non-consensus plugins such as Hivemind and SBDS. This will allow a lot of the functionality to be run off of SSD storage, rather than in RAM - which will lower the operating costs. By offloading data to Hivemind/SBDS and/or RocksDB (below), we should be able to reduce the requirements for a full node down to the same requirements as a consensus/seed node, which is an important goal of ours. <p dir="auto"><h2>RocksDB</h2> <p dir="auto">In addition to the non-consensus plugins, we have begun research on using alternative data stores and moving away from ChainBase. One such data store that has shown promise is <a href="http://rocksdb.org/" target="_blank" rel="nofollow noreferrer noopener" title="This link will take you away from hive.blog" class="external_link">RocksDB</a>. <p dir="auto">RocksDB is a fast on-disk data store with an advanced caching layer, which could further minimize latency when reading from and writing to the disk, as it is optimized for fast, low-latency storage. Used in production systems at multiple web-scale enterprises (Facebook, Yahoo, LinkedIn), RocksDB is based on LevelDB but offers increased performance thanks to its ability to exploit multiple CPU cores and SSD storage for input/output bound workloads. Its use in MyRocks, for example, led to less SSD storage use, longer SSD endurance, and more available IO capacity for handling queries. <p dir="auto"><h2>Further Modularization</h2> <p dir="auto">We are also working to modularize the blockchain beyond even what was originally planned for the initial AppBase implementation, for example, by having separate services that can be run on different servers. 
This will allow processes to be further spread across many small servers, increasing flexibility and decreasing cost. <p dir="auto"><center><h1>Conclusion</h1></center> <p dir="auto">As blockchain projects continue to become more mainstream, scalability is going to become more and more of a concern. Being a scalable blockchain is not just about being able to make a one-time fix to meet the current resource challenges. It is about being prepared to meet the future challenges as well. <p dir="auto">Steem has already proven itself as the fastest and most heavily transacted public blockchain in existence, and scalability continues to remain a top focus of ours. We know that scaling challenges will never completely go away, which is why we plan to continue innovating to ensure that whatever growth comes our way - we'll be ready. <p dir="auto"><em>Team Steemit</em> <p dir="auto">P.S. Don't forget to share your thoughts, objections, and insights in the comments!
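As a rough illustration of the storage figures quoted in the "Shared Memory File Size" section of the post, the Python sketch below works out how much headroom the recommended 150 GB leaves once the block log and shared memory file are in place. It is not part of steemd or any Steemit tooling; the function names are hypothetical, and the sizes are simply the "currently around" numbers from the post, which are expected to grow.

```python
import shutil

# Figures quoted in the post, in GB; both files are expected to grow over time.
BLOCK_LOG_GB = 90
SHARED_MEMORY_GB = 33
RECOMMENDED_GB = 150


def headroom_gb(capacity_gb, block_log_gb=BLOCK_LOG_GB, shared_memory_gb=SHARED_MEMORY_GB):
    """Space left on a volume of `capacity_gb` after both data stores are placed on it."""
    return capacity_gb - (block_log_gb + shared_memory_gb)


def free_gb(path):
    """Free space, in GB (10**9 bytes), on the volume that contains `path`."""
    return shutil.disk_usage(path).free / 1e9


if __name__ == "__main__":
    print("headroom at the recommended size:", headroom_gb(RECOMMENDED_GB), "GB")
    print("free space on this volume:", round(free_gb("."), 1), "GB")
```

Pointing `free_gb` at the directory that holds the shared memory file (hypothetically, wherever `data/blockchain` lives on a given node) gives a quick check of whether the drive still has room for growth.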

Can you guys please STOP CREATING NEW ACCOUNTS until you can properly mitigate the exploiters that are creating huge botnets and siphoning rewards from the pool.

You want to talk about scaling and resources? Most of the resource problems we have around here are directly related to exploits/spammers...problems that lie directly at the feet of STINC, via the last few hard forks and account creation.

Correct the bad decisions you’ve made and are making that have caused/are causing most of the current problems.

This problem was already brought up to light recently, I believe that blocking new accounts creation is a poor decision, it would just throw out the baby with the bathwater.
Those who are in charge of account approval are doing a terrible job, allowing entire botnets to be created while not letting real users in. They should be relieved of that task. Maybe there should be a new mechanism of approval.
Maybe it's better to make a new user create a post (like, obligatory introduction post, for instance) instead of making them wait months for approval. Then something like committee from known and established users would (cu)rate it, and make the decision if the user should be approved. It's a win-win for both parties.
One of the STEEM foundations is Proof of Brain, right? That's how it's done, no spammer can automate the process and those who can't put two words together will also be kept out. It's not like I'm hating all those Indonesian guys, writing about how happy are they to be on steemit with poor grammar and everything else, but they are not bringing any value anyway. There will probably be need for multiple groups for handling multiple languages, but that's another topic altogether.

“I believe that blocking new accounts creation is a poor decision...”

I’m not saying all account creation needs to be stopped. I’m only asking that those created via STINC’s current process be halted. There is an obvious, major malfunction with their system that needs to be addressed.

This is akin to what governments typically do. They undertake certain tasks/functions, they perform these tasks horribly, resulting in many undesired but predictable consequences, then they propose “solutions” for the very problems that they have in fact created. And all the while, everyone else (in this case users, potential users, and investors) ends up paying the price for their inefficiency and complete ineptitude.

Accounts can still be created. But it’s plainly obvious that whatever methods STINC is using are inadequate and are contributing to the creation of large bot-nets that strain the current resources and siphon rewards from the shared pool. The solution is to shut down the failing system in place until it can be corrected so that further damage is not done.

Compounding the errors/damage when it can be easily avoided is irresponsible. It’s time that STINC acknowledges their role in the mess that they have created...and then they ought to FIX IT.

Agree. Centralized account creation is a bloody hell right now. As all centralized solutions.

Yea, agreed too

Yep, as someone that has been downvoted to negatives by one abusive account holder: @haejin

I would say there is a massive loophole that needs to be fixed before @steemitblog can be taken seriously

Yeah, I can see how competing merchants or businesses would get into silly downvoting wars and whoever has the more goons wins.

And then there are the trolls...

until you can properly mitigate the exploiters

This is like asking a fat person to mitigate other fat peoples food intake ... or something.
¯\_(ツ)_/¯

via recent hard forks

Not so recent, the last one was about a year ago, but I know what you mean.

Yeah, that’s correct. I edited it.

And it should also be noted that they claimed to have HF20 almost ready to go when HF19 was implemented in early June last year. Ten months later...

For reference:

https://steemit.com/steemit/@steemitblog/proposing-hardfork-0-20-0-velocity

nah, first will be 19.4 ...appbase?
Or will they do both, HF20+appbase?
Because, well, why not introduce 2 error factors

Only two would be an improvement. We’re used to a minimum of five at once around here.

Maybe it needs to be an even number? :)

I thought STINC was using a Fibonacci sequence.

Thanks for the feedback. We understand this is an important topic for many users and will be addressing non-scalability related issues in future posts.

Great. Looking forward to it.

But for the record: This is a scalability-related issue.

It's like when you have a design for a bucket.

It doesn't matter a damn whether the bucket is 5 gallons or 10 gallons when it is riddled with holes.

Better to worry about patching the holes before increasing the capacity of the bucket.

We are losing users and rewards to scammy bot masters.

Hope they get their priorities straight. I really want this platform to succeed.

So it is more important to make the system more streamlined for the bots? Isn't that like removing part of your brain to make room for the tumor? I really don't understand how this platform works.

Thanks for sharing! ;)

couldn't agree more sir

I don't personally know much about any of these issues, yet, but the comment here by @ats-david certainly seems like it's worthy of a reply at least. Yes?

Can we please understand each other we have different views and reasons.

I am new here, but I saw a report somewhere saying that something like 90% of the users are bots. If that is true, he has more than a point; it seems they have co-opted this platform to enrich themselves.

I am also new. I thought an upvote drained power from your computer? So, if that is correct, then just don't upvote a bot, but assuming your report is true, then I'm probably wrong and it's some other technical mumbo jumbo reason. Or maybe astroturfing from competitors? lol, like facebook or something, or even the CIA who rely on corporations to collect private data.

BOOGA-BOOGA-BOOGA!!

I guess it could be a million things besides greed. what report? Link would be nice. I assume they are allowed here. I wont believe you in case you are a bot ;P

wtf-am-i-rrading.jpg

Nice comment

It's called load testing ;)

bad/greedy guys will be bad/greedy and you asking wont stop them lol its kinda what they do.

Criminals break laws; that does not mean we avoid arresting them. Something should be done about the abuse of the upvote system.

Give me vote and promote me plz

Hey team, the way you handle so many daily transactions leaves me to think you definitely know what you are doing with scaling. Keep up the good work. All we need is communities or some kind of organizing on this site and it can be a top 100 trafficked site no problem.

There are many things beyond this post that should have been covered. Sure, Steem handles an impressive number of transactions, but does that make it the best blockchain?

We need accountability on this platform, especially with the rampant number of bots running loose.

Absolutely Agreed !

Agreed.

Click the Image Below


Congrats, you made the @dtube #steemitminute for today!

Where do I get one of those fancy Dtube shirts?

FWIW, /dev/shm does not imply physical RAM. It is backed by virtual memory which can include swap. A properly configured system with the state file in /dev/shm will require less physical RAM and/or perform better with the same amount of RAM than one using a disk file (at least on Linux; I can't comment on other OSs). The trade-off is lack of persistence of the file across reboots (unless it is explicitly copied).

A Steem full node is essentially a data mart whose task is to give low-latency responses to a fixed set of queries. I believe the industry-best solution to this problem for now is an RDBMS cluster with a Redis cache.
Why don't you use it?

Looking forward to seeing how this turns out. As someone running three witness nodes and five full nodes, I think these changes are long overdue, and I hope they provide some relief.

I hope to see more frequent detailed communication from Steemit Inc in the future like this.

Thank you for this detailed explanation of everything, as a non-techy I found I was able to grasp it much better and I'm as determined as ever that Steem has an amazing team behind it focusing on the important aspects which is the scalability of the blockchain. We are already so far ahead of many other blockchains in terms of real world use and promising apps being built on top of it.

Resteemed.

Glad you think so! Thanks!

Great work, Andrew! I'm so excited to see this regular, professional communication. I know how much work goes into creating something like this, and I really, really appreciate it. It's a huge investment of valuable time, but this communication is so important (IMO).

Thanks Luke, we’re getting better at it. Just another scaling challenge ;)

I do not understand technology. So I follow your opinion. My friends in Indonesia say you are a caring Steemian.

I have a renewed sense of confidence. For a while I've been very concerned about scalability and had some doubts about the future of Steem. Thanks for the node and RAM suggestions. I hope the new AppBase/RocksDB implementations will slingshot Steem to the next level to accommodate the ever-increasing user base and transactions.

Thanks to all the dev team for their relentless hard work.

Thank you for sharing your concerns with us!

This is a great explanation, especially for those a bit newer to Steem/Steemit. I have a couple of questions that the devs may be able to help with.

What is the current size of the state file of a full node minus the account history? and what about for a node running only the account history plugin?

In the current Steemit backend, is the Jussi (reverse proxy server) accepting requests, and redirecting them to RPC nodes running different plugins if they're not in the cache, or is it using the SBDS database for some things already?

Hi @andybets - thanks for your question. Currently a node with only account history uses 204GB for the state file while a node with everything except for account history uses 190GB. Today, Jussi (api.steemit.com) is being used to split requests to the different clusters of nodes if they are not already cached. In the future, many of these requests will be split off to SBDS (for account history) and Hivemind (for tags/follows). RocksDB for account history (and later probably tags/follows) will be an alternate option. Either configuration can be forwarded through Jussi without the frontend (Condenser) needing to be aware of the change.
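The state file sizes in the reply above line up with the post's advice to split account history onto its own server. The Python sketch below is only a back-of-the-envelope check: the list of RAM sizes is an assumption for illustration, summing the two state files is an approximation of a combined node, and operating system overhead is ignored.

```python
# State file sizes quoted in the reply above, in GB.
ACCOUNT_HISTORY_ONLY_GB = 204
EVERYTHING_ELSE_GB = 190

# ASSUMED server RAM sizes (GB), for illustration only.
RAM_TIERS_GB = (64, 128, 256, 512, 1024)


def smallest_tier(state_gb, tiers=RAM_TIERS_GB):
    """Return the smallest tier that can hold the whole state file in RAM, or None."""
    for tier in sorted(tiers):
        if tier >= state_gb:
            return tier
    return None


# Two dedicated servers versus one server holding both state files.
split = [smallest_tier(ACCOUNT_HISTORY_ONLY_GB), smallest_tier(EVERYTHING_ELSE_GB)]
combined = smallest_tier(ACCOUNT_HISTORY_ONLY_GB + EVERYTHING_ELSE_GB)

print("split across two servers:", split)   # [256, 256]
print("single combined server:", combined)  # 512
```

Under these assumptions, each half fits in a 256 GB server, while the combined 394 GB would need the 512 GB tier, which matches the recommendation in the post.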

Thanks very much - that's really useful!

You rock, Justin. I hope you're not working too hard and are positioning yourself to get that vacation we talked about in Lisbon. Thanks for all the work you and the team are putting in and the results we're all enjoying, being part of the most performant blockchain on the planet.

Thanks Luke!

Bitcoin has been here for 9 years, and the core team couldn't figure out how to scale well. Operating costs for the Bitcoin blockchain are insane: energy costs, forks, clampdowns. You would think some other chain would have taken the lead by now.

Steem is two years young, it is working great, we are the busiest blockchain alive, more lives have been changed for the better, anyone who has seen the potential here from the early days is quite happy by now. This is just a beginning. I'm glad STINC is thinking ahead, this just reinforces my hopes for the future. Steem on!

Thanks so much for your comment - do you have to call us 'STINC' though? Just sayin' 🙂

Oh, I thought that was just a short version for SteemitINC, not saying that you stink or something :)

STeemit INC...STINC

Seems like an appropriate abbreviation to me. It beats spelling out “Steemit Inc” every time.

How about SINC ? (shorter to type lol) 🙄

Sounds and looks better.

Yes STEEM is a great platform. It is a paradigm shift. Nothing happens overnight. And the good thing is that if we don't like something about it we can fix it. What if we all focused on fixing the issues along the way?

Steem is one of the best blockchains. However, scalability is a real issue given the growing community.

I would rather have a post on exploring steem development...

Steem team is amazing!

Arca.jpg

Scalability level GOD, Best regards friends of @steemitblog!!

How do we fix the problem of spammers creating hundreds of accounts and posting the same article text thousands of times?

Very accessible explanation and update for the community. Thanks!

I have been running a witness node since September 2017. I have never put any shared_memory on RAM, all of it on a regular SATA SSD, and my RAM usage hovers under the 1 GB mark. Never had any problems at all, with disk usage around the 5% mark. Replays probably would take a while, but I have actually never replayed - not even once - since September 2017. It has just been rock solid. I've been playing around with a NVMe RAID0 server, and experimenting with /dev/shm, but can't think of a single reason to go the RAM route. Replay finished from scratch in three and a half hours, and as demonstrated above, replays are pretty rare. So, I'll stick with NVMe SSD for now, Optane next. Hopefully, with all of the scalability improvements, I won't be using RAM for years to come.

PS: I have noticed shared_memory is very compressible. I'd be looking forward to some compression tech built into Steemd. Of course, we can use existing workarounds now.

Thanks @liberosist - I really appreciate you communicating your real-world experience with running a witness node on this post.

<p dir="auto">You are also correct about the <code>shared_memory</code> file being fairly compressible. We actually run a service that compresses state files regularly; these are then pulled in and decompressed on startup. This makes it possible for us to autoscale steemd instances on-demand.
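For anyone who wants to check how compressible their own state file is, here is a minimal Python sketch that streams a file through zlib and reports the ratio. It says nothing about the service described above, whose compressor and format are not specified in this thread; zlib is just a stand-in, and the demo runs on a throwaway file of zero bytes rather than a real `shared_memory` file.

```python
import os
import tempfile
import zlib


def compression_ratio(path, chunk_size=1 << 20, level=6):
    """Stream a file through zlib and return original_size / compressed_size."""
    compressor = zlib.compressobj(level)
    raw = packed = 0
    with open(path, "rb") as fh:
        while True:
            chunk = fh.read(chunk_size)
            if not chunk:
                break
            raw += len(chunk)
            packed += len(compressor.compress(chunk))
    packed += len(compressor.flush())
    return raw / packed if packed else 0.0


# Demo on a throwaway file of highly repetitive bytes (NOT a real state file).
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"\x00" * (4 << 20))
demo_ratio = compression_ratio(tmp.name)
os.remove(tmp.name)
print(f"demo ratio: {demo_ratio:.0f}x")
```

Reading in fixed-size chunks keeps memory use flat, which matters for files in the tens of gigabytes. It is safest to run this against a copy of the state file taken while the node is stopped.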

Ah, so I would like to see that tech built into Steemd itself so it always stays compressed - even when it's running. I know this will add CPU overhead, but since AMD's EPYC and Intel's Skylake-X response, we are headed into a world with more CPU cores than we know what to do with. I know some have experimented on zram on Linux, and it works fine. Anecdote - back when Steemd ran on Windows, the in-built RAM compression in Windows 10 kicked in. Back then, it was all on RAM (no shared_memory) and a full node typically used 15 GB. In Windows 10, that was down to only 3 GB, the CPU overhead was minimal and it ran flawlessly. Of course, very different times now, but I'd be looking forward to seeing compression tech built into Steemd.

RocksDB has built in compression. It will store all data on disk compressed and in a machine independent format.

Great to hear! Look forward to the transition from Chainbase to RocksDB.

@Vandeberg

<p dir="auto">On the STEEM master branch, I ran <p dir="auto"><code>git log | grep Vand | wc -l</code> = 1801 <p dir="auto">Appreciate your hard work. I keep seeing your commits! <p dir="auto">I wanted to check a few details: <ol> <li><p dir="auto">So RocksDB will keep the historical data and provide fast access on demand?</li> <li><p dir="auto">Only the new transactions would stay in memory / the blockchain, so we could get something similar to the Nano blockchain?</li> <li><p dir="auto">If we achieve point 2, then we may not need multi-threading in the near future?</li> <li><p dir="auto">Now, if points 2 and 3 are true, STEEM can be a general-purpose blockchain on steroids and can easily compete with other specialized chains like Hyperledger?</li> <li><p dir="auto">On Hivemind, I observed that PostgreSQL and the communities logic are confined to single boxes. So horizontal scaling is not planned right now, i.e., for the event that PostgreSQL hits its limits and the logic needs more CPU/RAM than a single node can provide? (In a nutshell, the current implementation can scale vertically up to some point, and later, for Facebook scale, we would need to add support for horizontal scaling.)</li> </ol>

@Vandeberg

Sorry for re-bumping this old comment thread, but didn't know how else to get in touch with you! I was referred to you by Julián González.

I'm José Macedo, senior analyst at AmaZix (https://www.amazix.com/ // https://www.linkedin.com/in/ze-macedo-15b1b175/). In case you haven't heard of us, we're one of the largest community management, consulting and advisory companies in the crypto space, partnered with Bancor, HDAC, Bankex and many others. We've worked with over 100 projects and now have over 100 employees.

We're now working on a forum project with a token to incentivise quality content/curation. We really like the STEEM inflation algorithm for this and are very interested in building on top of STEEM. Our ideal scenario would be to use the SMT protocol, but we realise launch is scheduled for January and we simply cannot wait that long to launch.

We’re curious if we’d be able to chat to someone from the SMT team to discuss our options in the meanwhile. Currently, we’re leaning towards issuing an ERC-20 and then switching to SMT’s once they launch, but we’d love to find a way to build on STEEM from the get-go.

Let us know if you have some time to talk and discuss our use case.

Thanks,

José

@justinw - read your other comments. Keep up the good work.

I have questions / doubts:

This makes it possible for us to autoscale steemd instances on-demand.

i.e., you are compressing the block_log files and decompressing them for auto-scaling?

  1. Are you just using xz/zip/bzip/lzma or something else?

  2. Curious about the auto-scaling as well - do you use something like ELB from AWS or a hardware load balancer (F5)?

Thanks in advance.

How did you update steemd?
There has been at least one patch at the end of December or in January, hasn't there?
After recompiling and restarting the steemd process for that, you should have needed a replay.

The update did not require a replay. Just a rebuild and restart.

great info

The exponential argument is convincing, even for those of us who are not blockchain experts. It's intuitive because it agrees with the reality we are familiar with. It is easier to take a building from being able to support ten people to being able to support twenty than two hundred. To do that, you would have needed to start with a building that could support one hundred (although the analogy isn't perfect, because that type of progression is usually at best linear).

I have been noticing some glitches since I have been on steemit. Hopefully it does turn out to be majorly scalable. Keep up the hard work. It will be interesting to see what the future holds.

Can't wait to see RocksDB in action ;-) Thank you for sharing all tech details. MOAR! deep dives please ;-)

This is fascinating stuff. I'm interested to see how Steem/Steemit can scale to many millions of users, as it must do to compete. You can't just keep adding memory forever. How much of a bottleneck is CPU?

CPU has not been the bottleneck in steemd - but with the release of appbase it can be. Making CPU the bottleneck is actually a good thing because then it becomes horizontally scalable based on usage. It gives you a metric that you can actually scale instances on - previously it was based on the number of requests, which is difficult to guess at because certain requests can take longer than others. To answer the question directly, appbase steemd can handle an extremely large number of requests before becoming bottlenecked by CPU.

Thank you for addressing the scaling challenges and how Steemit is working on solutions to stay ahead. Excellent news!

Please continue to provide regular updates on planned developments / improvements, since this gives people, including me, confidence about Steem's future.

I have a question....this blockchain was created with 3 second transaction time...is that something that could slow down if the scaling you mention in here is not done? Would we see a backlog of transactions, potentially, like BTC and ETH experience?

Or are the scaling issues more to do with the time it takes for the applications that have been created to interact with the blockchain?

Those are great questions!

The block interval does not impact scaling in any way other than the fact that it is part of defining the maximum growth of the blockchain itself (not state size). <code>max_block_size / block_interval</code> is the maximum growth rate for the blockchain in bytes per second. We are not experiencing any problems in the live performance of the blockchain. We would likely only consider changing the block interval if there was a compelling reason to do so for the benefit of live performance.

Some of the changes we have been working on do specifically target response times for our APIs. Appbase parallelizes all API requests. Previously, requests were handled serially. This doesn't scale well and is a waste of resources, as almost all computers now have multiple CPU cores. With Appbase, we are now utilizing the other 90% of our servers that weren't being touched previously.
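To make the growth-rate formula concrete, here is a small sketch of the arithmetic. The 3 second block interval comes from the question above. The 65536 byte maximum block size is only a placeholder assumption, since this thread does not state the real limit.

```python
BLOCK_INTERVAL_SECONDS = 3            # block time mentioned in the question above
ASSUMED_MAX_BLOCK_SIZE_BYTES = 65536  # placeholder assumption, not stated in this thread
SECONDS_PER_DAY = 86400


def max_growth_rate(max_block_size: int, block_interval: int) -> float:
    """Upper bound on blockchain growth, in bytes per second."""
    return max_block_size / block_interval


def max_growth_per_day(max_block_size: int, block_interval: int) -> int:
    """Upper bound on blockchain growth, in bytes per day (every block full)."""
    blocks_per_day = SECONDS_PER_DAY // block_interval
    return max_block_size * blocks_per_day


if __name__ == "__main__":
    per_day = max_growth_per_day(ASSUMED_MAX_BLOCK_SIZE_BYTES, BLOCK_INTERVAL_SECONDS)
    print(f"{per_day} bytes/day, about {per_day / 2**30:.2f} GiB/day")
```

Under these assumed numbers, the ceiling works out to about 1.76 GiB per day, and only if every block were completely full. It bounds the block log, not the state file.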

Why not adopt MPI based code... in the steem source code... I mean... if you can have 100 cell phones.. doing a steem node... because you can't buy a BIG server! Would not that be awesome?

I mean... even if you look further... this will give STEEM a new market... which is full of performance... called HPC (High Performance Computing)... which is also inside the cloud, and reaching out for AI, deep computing and deep learning.

I am not joking, and anyone that can spin up a VM for a normal wallet can also spin up a wallet for an HPC cluster that can outperform many of the BIG VMs inside the cloud offering. Just check it out!

HPC is the way! No more... careless infrastructure!

I have bumped my comment (voting on it).

Good to know steemit.inc is scaling up quite nicely. I want to say a big thank you to all the witnesses investing their time and especially money on hardware to power up the platform to unprecedented heights. You guys are the real MVPs.

Steem Scalability !!!!!

Whales are getting much more than expected, new users are finding it difficult to even sustain on this platform.

The reward pool decreases every 250k blocks, and it holds a fixed amount of STEEM. Users are flooding in like hell, so what will they get from here? Just pennies!!

What if it onboards the same number of users that Facebook has? How will it reward such a big number of users?

Sorry to say, but I think that it doesn't have the potential to compete with Facebook, and it doesn't have the potential to reward its users.

I have suggested the idea that instead of using Steem Power to determine the reward amount, we can bind one vote to a fixed amount. Steem Power would just be used to allocate bandwidth: the more Steem Power a user has, the more he can post on this network.

You are missing the main point here. The more users onboard, the more investors onboard and the more 1 Steem will be worth.
With 200 million users you might only get 0.1 Steem per post, but if 1 Steem is worth 1000 dollars, you'll still get a decent reward.

1000 dollars seems a bit too far-fetched; more does not mean better.

I just took a huge value for the Steem value and a low value for the reward just to make a point.
The values are not supposed to be realistic.

I figured as much, but yeah, I am not denying Steem is going to be amazing in the near future.

The work and reward some one gets on here is better than any coin out there.

Thank You for explaining👍
Steemit is growing by the minute, everyday.

You seem to be missing the point as well. More users does not mean more investors/value.

That's not entirely true, the more users there are the more interesting Steem becomes for investors. For sure, it also depends on target groups etc. But the more users there are, the more potential target groups there are as well.

Indeed. So it doesn't mean more value. It means there might be more value. That's not the same. It's not like selling cars where selling more means more profit. I didn't mean to drag this open. Just wanted to point out there is no one to one relation.

Of course not, and I'm pretty sure that people who post nothing of value will get way less than they get now, because their share will be smaller, but that people who post things of value will probably get more since the userbase is bigger which brings some sort of bigger investments with it, even if not 1:1 related.

SMT is a fix for this, I assume. If Steemit Inc pulls it through, then community tokens and proof of brain will succeed.

That's a really good idea for making it easier for new people to get traction on here when no whales (or even dolphins) are following them to give them big upvotes. And while in some ways this is a side conversation, since this post is really about scaling the tech, not how to increase the size of the user base, the two are definitely related. So let's play.

So yes, would be good to give new people a chance to more powerfully upvote each other the same way old-timers can powerfully upvote each other. However, there is also something to be said for providing someone who has invested their own money in the platform with some additional stake in things like how rewards get allocated. The same goes for those who have invested many months of their time in posting and growing quality content on the platform. I don't think bandwidth, the ability to post more, is adequate compensation for that investment.

And speaking of investment, we also want STEEM to be an attractive one, even for people who don't intend to post. It may seem like SP wouldn't matter to them at all, but not so, because now they can delegate their SP at an interest rate of about 25-30%. That's a darned good return! In fact, I'm currently developing an advisory relationship with a socially conscious investing firm, and guess which coin I'm going to be recommending as their #1 play! But this only works if other people have a reason to pay those returns to them to buy their delegated SP. And the only reason someone buys it is so that they have bigger upvotes.

The problem is for people who have no money to invest at all and also haven't been here that long, so haven't invested sufficient time either. I don't know that there is a way to equalize for that, or if there even should be. Might the initial SP delegation be bigger? Yes, I think that would be good. I think now the investment of time needed to get any traction is just too long. I also think that delegation should remain longer after the person starts buying small amounts of STEEM they can afford. Right now if you buy only $20 of STEEM you basically just replace some of your delegated amount with that. You get no additional upvote ability. Ouch! I keep trying to explain to people they need to invest at least $100 to get any kind of decent upvote, but not everyone has $100. Do we not want them on STEEM dapps?

So there is definitely room for improvement in how we scale the user base successfully without churn. But I don't think it's as radical as decoupling SP from upvote power.

Hmm🤔, true.

Exactly what I thought @raycoms.

Even if we still get pennies, that would mean that the project is a (sound) success.

Steemit is a big success indeed.
✔️

I disagree about Facebook.
Facebook cannot compete with Steemit.
Facebook is just a waste of time, while Steemit is a community of Supportive people who look out for each other.

I've made more resourceful friends here on Steemit than on Facebook.

Facebook is a lie.❌
Steemit is the real deal.✔️👍

Working on improving the front end is more important than scaling issues right now because it is impossible to find good content on Steemit. Furthermore, Steemit doesn't offer advanced features like notifications compared to busy.org.

So far, as I study the workings of Steem, I have yet to hit a fatal error. The system has worked very well, and I have yet to find a better platform than this one. The team is very wise in making decisions, and all users will be thankful for this notice, even if they do not know about scaling. Hopefully things get even better as the blockchain gets more room to grow. We don't want to lose Steemit, so keep working to make great changes. If you agree, I will translate this post into Indonesian and re-post it with credit to your original post, to share insight in my country about scalability, AppBase, steemd, RAM, full nodes, and the node performance improvements.

Sounds good to me, go for it!

Andrew Levine
Content Director, Steemit


Here's the post I have translated:
Beberapa system penskalaan (Exploring Steem Scalability) yang harus dipelajari sebelum menjadi Witness

Hi... I've finished translating the post into my language. I am very grateful to @andrarchy, who has given me this great opportunity. Hopefully my friends here can gain knowledge from this post.

Do I really get permission to translate this post into Indonesian?
Wow Thanks a lot...

This is great information and timely. I was just looking for an explanation, best one so far. As a few of the others have said well written for those looking to see what it takes to run the different nodes. Much appreciated.

Wow, interesting


the majority of blockchain transactions occurring globally are being done on Steem.

Love it!

Thank you for the update!

Wishing you techie wizards the very best with the Steem scalability solutions, because if mass adoption of Steem is to come and over-spill into every day world transactions, and be truly disruptive, as so many hope for, we will need this fundamental development to carry us forward and bear the weight!

So... it's more important to see this focus, than a lot of the vapid opinion about the nature and design of steemit itself, which has surely already proved itself as the effective platform for free speech we need.

As for decentralisation versus centralisation, I personally am less worried about this aspect. I think it's good to have some level of expert, dedicated control over the protocol, for now.

Best of luck and please keep us updated!

You guys have evolved so much. It is encouraging to see how much your team has developed in just two years, and how the project, Steem and Steemit, has grown into a great number of applications on top of a powerful blockchain.

It is also great that you detailed the problems that Steem has right now with the RAM and how it should be addressed by witnesses and node owners, as this started to be a problem pointed out by many of them.

Overall you are doing great and really improving your communication with the people here who missed it so much. Keep going!

Thanks for the words of encouragement. We're trying!

Steem has already proven itself as the fastest and most heavily transacted public blockchain in existence, and scalability continues to remain a top focus of ours. We know that scaling challenges will never completely go away, which is why we plan to continue innovating to ensure that whatever growth comes our way - we'll be ready

I love the fact that @steemitblog and its team are working daily to improve the blockchain and Steemit experience for its millions of users.

I DON'T NEED A SCALE BECAUSE I HAVE ONE IN MY BATHROOM ALREADY!
YOU'RE NOT MY MOM.

I really appreciate you. Thanks for the content you have written.

Thanks for article...

Such an interesting and educational post, thanks for educating us.

good job friends success continues

Wow awesome post. Great article about steemit. Thanks

Yes it is an amazing article, I hope it will help just in case you want to start your own node.
Sarcasm

This is still a very difficult concept for me to grasp.
I will probably have to dedicate a week to research and summarize my findings; that is how I learn best. I am a bit tense about digging into this one; however, I know it must be done and comprehensively understood.
I'll be back to reference this.

Xo Thank you for the valuable information @steemitblog

"In other words, the majority of blockchain transactions occurring globally are being done on Steem."

This is one of the best things about STEEM. If only we would have a wider public adoption. :(

Sure, but let's fix steemit before it goes even more public; it's a shit storm right now.

This has been the argument for two years. We’re not any closer to making this place not a shit storm (and it’s actually worse), nor does it seem that we’re any closer to reaching a more mainstream audience.

So what does this mean? We can handle the largest amount of transactions (we're better than visa, paypal, bla bla bla) but we fail to handle the backend and look up previous transactions. Transferring to the bank is ok, no record necessary; transferring from the bank, well, hang on a while, we're having difficulties checking how much you own.

That's called a bottleneck. As so many others have said: what about the future of the platform? How do you plan on retaining users? What is your future stance on proof-of-brain vs voting services? How do you think on improving the reward system? Or is it considered perfect by now? ... All we get is half baked announcements that are either hard to understand or turn out later to be untrue. In the meantime we see busy.org, utopian and so many others evolve. But STINC... It's getting frustrating.

We intend for full account history to be available again soon on steemit.com and apologize for any inconvenience it has caused. By putting a large amount of effort into the scalability of Steem blockchain services, we will make it easier for other Steem apps to scale as well. The efforts we are making support the entire Steem ecosystem, not just steemit.com. In addition, a large number of Steem apps actually rely on our infrastructure to run their sites.

We are trying to be as communicative and transparent about Steem development as possible - the community was very vocal about letting us know their desire for open communication from Steemit Inc. - and that's exactly what we're doing here.

If there are any specific scaling concerns that you feel we haven't addressed here, please let us know and we will do our best to address your concerns.

I like the fact that you are proactive like this. I just got on steemit today and I think it's got so much potential. Now I need to figure out what I want to blog about.

I appreciate your honest response Justin. As a developer of www.steemmakers.com I realize how important managing the growth is. I appreciate the communication but I'm still disappointed more than I am pleased with the communication. I'm sure you are well aware of it. The question remains if it is going to change. This is not the kind of communication we are looking for. Just scroll through the comments to see what's keeping the community busy. I haven't seen STINC take a position on any of the simple questions in my initial comment.

Bitcoin isn't "stuck at a standstill" with scaling. It's called "Lightning" and it's on the main-net now.

ETH, however, has a much longer road to travel since "CryptoKitties" is clogging their blockchain with data.

As far as Steemit is concerned, Delegated Proof-of-Stake will always have scaling problems, no matter how much RAM you throw at it. All you're doing is making sure the end-user node requirements increase to the point where no reasonable machine can sustain it. (Without being located in server racks at a co-location facility.)

It's the same path a lot of failed coins have traveled, and you're going to have to come up with some better solutions than pushing up the requirements.

But who am I kidding, the Reward Pool abuse is rampant, a significant part of the platform is all bots -- so there won't be any future at this rate.

As far as Steemit is concerned, Delegated Proof-of-Stake will always have scaling problems, no matter how much RAM you throw at it. All you're doing is making sure the end-user node requirements increase to the point where no reasonable machine can sustain it. (Without being located in server racks at a co-location facility.)

Can you elaborate on this? What scaling concerns are you referring to? Are you also saying that PoW blockchains don’t have these concerns too?

I think there is a (small!) bit of a valid point here in that nobody really claims PoW blockchains scale well (okay maybe the extreme big blocker faction?). PoW is trying to do something (permissionless system, nodes on non-dedicated hardware, objective global consensus, etc.) that isn't really focused on scaling as a primary design goal.

However, by contrast DPoS is very much built around a premise of scaling and projects (and Graphene in particular) have a history of making extreme claims about scalability (such as 100K TPS or higher, claims of being able to support Reddit-scale in the Steem white paper, etc.) which are often narrowly made to the point of being misleading IMO.

CPU-wise it might be valid to throw around these bold claims, but can you imagine how fast the blockchain would grow and how unmanageable it would quickly (!) become if someone actually tried to do this? (Not to mention other issues such as bandwidth, etc.)

So measured against the claims, I think it is fair to say that DPoS has more scaling 'problems' because its claims and ambitions are much higher.

So measured against the claims, I think it is fair to say that DPoS has more scaling 'problems' because its claims and ambitions are much higher.

I agree with this.

I didn't say that PoW didn't have scaling concerns - I'm just VERY concerned that a RAM allocation tweak is the path that DPoS is going, knowing that Proof-of-Stake has its own unique problems.

But you know what, I don't have to go down that rabbit hole -- if Steemit is still functioning a year from now with a larger user base, then I'll be wrong -- otherwise, I see this as a step down the path to ruin.

I wouldn’t characterize this as a RAM allocation tweak. It is more of a change to not use RAM, and instead use something that is more scalable: SSD.

Unfortunately that won't take things very far. I slightly disagree with the post about 16 GB witness nodes even with fast SSD (and especially 8 GB). I've tried to do this (and I still have one running) and it is very painful in practice. When the memory usage doubles then 32 GB nodes will become likewise painful.

But even if you disagree with this, it is still clear that it is going to become a problem at 3x or 4x or so. SSD is really not that much more scalable than RAM given how things work now, rather only a little (using an actual database has the potential to improve that, but that seems somewhat far away).

But even if you disagree with this, it is still clear that it is going to become a problem at 3x or 4x or so.

I don’t know enough about the inner workings to know whether this is true or not.

I obviously trust your judgment on these types of things (more than mine) but I am also not 100% convinced that it wouldn’t work to have the shared memory file grow to several hundred GB (or more), and still only require 16-32 GB of RAM.

It may work, but it will just be extremely slow, not only for a full replay but even when syncing a few days of blockchain after downtime. This is already the case at 30-something GB state with 16 GB RAM and 2x NVMe RAID0. I'm not really sure what they are thinking when they claim 8 GB is still usable. It certainly isn't for me. I was very seriously considering dumping the 16 GB, but I haven't yet.

One of the main concerns that the post was trying to address was that the requirements for witness and seed nodes had already reached 64 GB of RAM and were on their way to 128 and continuing to grow.

The belief that the state file should be stored in /dev/shm/, and that it was best to either rely on swapping or having enough RAM to hold the entire state file was a large part of this misconception.

You are correct that at some point in the future, if we continue to grow without making any changes, we would reach a point that 16 GB servers, and even eventually 32 GB would not be enough, but many of the changes we are working on (such as AppBase and RocksDB) are intended to address this long before we reach that point.

  1. The info (both in the post and here) about /dev/shm is completely wrong
  2. It is true that 64 GB is not currently required (and certainly not 128 GB which is actually overkill to the point of being useless) for a witness node. 32 GB works fine. 16 GB is pretty questionable and I'm doubtful that 8 GB is even usable (I'm doing some tests right now). 32 GB will be entering into that state once the file exceeds physical memory by a sufficient ratio, which from past experience at various sizes seems to be about 2x (I believe the file is around 35 GB currently). The memory mapped approach just does not do a very good job with delivering good performance when the data size is much larger than memory.
  3. Moving the data to a database should improve the scalability of data size with respect to physical memory dramatically, once development is finished, but may have other tradeoffs. We will have to see how that works out.
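The rule of thumb in point 2 can be written down as a quick calculation. This is only a sketch of the commenter's heuristic: the 35 GB file size and the roughly 2x threshold come from the comment above, while the function names and category labels are made up for illustration.

```python
def pressure_ratio(state_file_gb: float, ram_gb: float) -> float:
    """How many times larger the memory-mapped state file is than physical RAM."""
    if ram_gb <= 0:
        raise ValueError("ram_gb must be positive")
    return state_file_gb / ram_gb


def classify(state_file_gb: float, ram_gb: float,
             painful_ratio: float = 2.0) -> str:
    """Rough label based on the ~2x heuristic from the comment (not a measured result)."""
    ratio = pressure_ratio(state_file_gb, ram_gb)
    if ratio <= 1.0:
        return "fits in RAM"
    if ratio < painful_ratio:
        return "partially paged"
    return "likely painful"


if __name__ == "__main__":
    STATE_FILE_GB = 35  # approximate size quoted in the comment
    for ram in (64, 32, 16, 8):
        ratio = pressure_ratio(STATE_FILE_GB, ram)
        print(f"{ram:>3} GB RAM: ratio {ratio:.2f} -> {classify(STATE_FILE_GB, ram)}")
```

With a 35 GB file this lines up with the experience described: 32 GB sits just above 1x, 16 GB is already past 2x, and 8 GB is past 4x.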

@andarchy

The belief that the state file should be stored in /dev/shm/, and that it was best to either rely on swapping or having enough RAM to hold the entire state file was a large part of this misconception.

So essentially the suggestion is that only during the initial indexing do the files need to be read quickly, and from that point onwards only the last blocks need to be in RAM for faster I/O?

In other words, once the reindexing is over, which is CPU driven and IO driven, the "tail end of the blockchain" is what gets I/O and the rest need not be in memory.

If this is the case, we need to carefully page out the older parts of the blockchain from memory so that only the new part (the tail end) stays in memory.

Does this make sense ?
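For readers unfamiliar with memory mapping, here is a small generic sketch of the mechanism being discussed. It shows how a program can map an entire file but read only its tail, leaving it to the operating system to decide which pages actually occupy RAM. It is not steemd's implementation, and the helper names are hypothetical.

```python
import mmap
import os
import tempfile

PAGE = mmap.PAGESIZE


def make_demo_file(num_pages: int) -> str:
    """Create a temp file whose last page is filled with b'T' and the rest with zeros."""
    fd, path = tempfile.mkstemp()
    with os.fdopen(fd, "wb") as f:
        f.write(b"\x00" * (PAGE * (num_pages - 1)))
        f.write(b"T" * PAGE)
    return path


def read_tail(path: str, nbytes: int) -> bytes:
    """Map the whole file read-only, but access only its last nbytes."""
    size = os.path.getsize(path)
    with open(path, "rb") as f:
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as mapped:
            # Only the pages backing this slice have to be brought into memory.
            return mapped[size - nbytes:size]


if __name__ == "__main__":
    demo_path = make_demo_file(64)
    try:
        print(read_tail(demo_path, 8))
    finally:
        os.remove(demo_path)
```

In this model the application does not page data out by hand. The kernel evicts pages that are not being touched when memory is needed, so how well it performs depends on how scattered the accesses are.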

Indeed @timcliff you are correct - disk space is relatively inexpensive while RAM is much more so. By using other services to reduce our 'full nodes' to something more like a 'consensus node', we are making it possible to easily and cost effectively scale.

This is an excellent point and a more succinct and clear way of expressing most of what the post was trying to say.

Oh no? Then this statement:

Nodes that are running additional API plugins (especially account history) will require more RAM to support a larger state file.

Certainly isn't congruent at all.

Here's the thing, you're taking one strategy "Put it all in RAM" and substituting "Put it all on SSD and page INTO memory", which doesn't address why you need to do the above in the first place.

The reason you have to split things into "modules" and try to even out the load is that your full nodes using delegated-proof-of-stake will not scale any further without "further optimizations" which right now consists of "offload the stress to cheaper/slower IO device".

It's very similar to what Ethereum is trying to solve, because their blockchain is bloating way too fast (lots of blocks filled with all kinds of junk, like CryptoKitties - the faddish Pokemon ripoff).

All I'm seeing here is a fire-drill response that will result in short-term relief, but hasn't addressed the fundamental problems that exist.

It's okay, with the retention metrics being consistently crap - I think the only demand you are feeding is that of the bot armies that are diligently sucking the Reward Pool dry.

It's a bit like upgrading your email server because you have a lot of spammers hammering it. It doesn't help anything, because the root problem hasn't been solved.

Sorry, but I think you are misunderstanding some of the technical aspects here. There are two different things being discussed.

If you are talking about DPoS - then you are talking about consensus nodes. These are the nodes that keep the blockchain state updated, and ensure that all of the new blocks are 'valid'. These nodes are covered in the "witness and seed node" section. Everything that is required for the DPoS portion of Steem to run is contained in these nodes, and the post was very clear that these nodes do not require the state file to be stored in RAM.

The part that you are quoting is talking about "full nodes" which get into API calls. API calls are an application layer built on top of the blockchain consensus rules. These nodes require more RAM because of the way the code is currently implemented, but eventually (as stated in the post), this logic for all of the non-consensus API methods will be handled separately - through things like HiveMind, SBDS, and RocksDB.

If your concern is rewards pool abuse and spam - those are valid concerns, but they are not going to be resolved in the context of "addressing the scaling requirements of the blockchain". Fixing spam and abuse issues might slow down the growth - but the ever increasing growth of the blockchain problem will still always be there - so scaling is something that still needs to be addressed regardless of what is/isn't done to address spam and abuse.

It seems we've reached the TTL for this convo, which is fine.

See you in a year, if this is even still around. Then we'll talk.

Finally, after 41 days - @ned upvoted again.. ;)

But on a serious note - great job @andrarchy (my guess is that you've written that one) for keeping us updated and to the whole Steemit-Team (esp. Devs) for your passion and work.

Really excited for the future!

Thanks, it's definitely a team effort as many of these solutions involve different projects and different people. Many of our devs took time out of their busy schedules to contribute.

Keep up the good work !

Very good explanation. Cryptocurrency is here to stay for a long time. It's a good time to learn how to take advantage of it!
