FINRA partnered with AWS product teams to build an advanced analytics solution that makes extensive use of Amazon EMR and Amazon S3. In this session, you’ll hear how FINRA implemented a data lake on S3 to provide a single source for its big data analytics platform. FINRA ingests 75 billion records of stock market transactions each day, with an AWS storage footprint of 20 petabytes across S3 and Amazon Glacier. To handle this workload, FINRA architected a platform that separates storage from compute so that capacity for each can be managed independently, improving performance and cost-effectiveness. You’ll also learn how FINRA used HBase on Amazon EMR to achieve significant benefits over running HBase on a fixed-capacity cluster. The resulting system scales seamlessly as data grows and scales quickly in response to user traffic. By working with multiple clusters, FINRA can now isolate ETL and user query workloads. Because the data is stored on S3, FINRA has also achieved rapid, built-in disaster recovery, with the ability to run from multiple Availability Zones and across regions.
Published on December 2, 2016 by Amazon Web Services