Hello, good morning. Welcome to re:Invent 2023. Thank you for taking this time on Monday morning to come over and listen to us about how to harvest your data with the help of analytics and machine learning. Customers like you tell us that data is critical. With storage costs going down day by day, it's very easy to collect a lot of data points for your processes. Where you face challenges is how to harvest that data, how to make use of it for your business case, how to capitalize on it for new use cases.
I'm Ravindra Gupta. I'm the worldwide go-to-market lead for machine learning. And today we're gonna talk about how you can leverage our analytics and machine learning services, which are packaged together as best practices, to harvest your data.
We have hundreds and thousands of customers who are taking advantage of these services. And the good thing is, what better way to learn what those best practices could result in for you than from one of your own customers?
So our customer Dan Xue, she's from Rocket Companies. She's going to share her experience about what problems they faced with their data and infrastructure, how they capitalized on it using machine learning and analytics, and what impact that has created.
So please help me welcome Dan Xue to take the stage and talk about Rocket Companies' story about harvesting data.
What up doe, Vegas? 8:30, and you made it? Give yourselves a big round of applause. My name is Dan Xue from Rocket Companies. I've been with the company for 10 years. I tend to call myself the mother of the modernized data platform. What I did is build an enterprise data lake and data warehouse, and we continue thriving in the era of Gen AI. Who's ready? Today we are going to show you a journey - what were some of the pain points, and how did we complete a full modernization of our data science platform?
I am committed to bringing you energy, some insights, and inspiration to carry you through the day. Are you guys ready? How many of you have heard of Rocket Companies, Rocket Mortgage, or Quicken Loans? Nice. I see a lot of hands. To get started, I would like to ask each and every one of you: please take a deep breath and think - what is your American dream? What is your definition of the American dream?
Bingo! Love it. And you need a house for retirement too. I came to this country 11 years ago. My American dream is exactly as yours - getting a beautiful house, two floors, nice basement, and a decent yard where my kiddos can run wild. By show of hand, how many of you share similar American dreams? I'm seeing a lot of you.
What that shows is that despite our American dream and passion, at least 40% of people in America feel buying a home is the most stressful thing in modern life. We want to change that. Our founder, Mr. Dan Gilbert, had a dream. His dream was to drastically simplify the mortgage process to help everyone fulfill their American dream of home ownership.
We started with innovation at Rocket. We have a strong series of cultural principles we call the ISMs. One of my favorites is "You will see it when you believe it." We need to believe in innovation. In 1996, Mr. Dan Gilbert figured it out: instead of having clients sit in a branch filling out a daunting stack of paperwork, why don't we just deliver the entire mortgage process in a pizza box? And he did it within two months.
We have since accomplished 35 million closed loans in volume. In 1998, when the internet became popular, we committed our entire resources to taking this whole process to the internet. This is more relevant than ever, because the era of Gen AI today is exactly like the nineties when the internet came out - what do we choose to do as a business?
Fast forward to 2015: we were the first to deploy a fully online, end-to-end mortgage process, and we launched Rocket Mortgage. We did not stop there, because we quickly realized that beyond the mortgage process, our clients have more complex moments in their lives. They need more services.
So we grew from Quicken Loans into a fleet of financial services, bringing a big variety of services to our clients. You need personal loans? We founded Rocket Loans. You need real estate agent help? We have Rocket Homes. If you need to manage your finances and wellness, we recently acquired Truebill and rebranded it as Rocket Money.
Not only do we want to provide you the financial services you need, we also give back to the community. One way is that we have a day where we take responsibility. We also have the Gilbert Family Foundation, with the mission to cure neurofibromatosis - just last week we raised $55 million in one night. We also want to end homelessness in Detroit.
We give back. How many of you come from the financial services industry? About 30% of the room. What is our biggest challenge in the market? This is a chart of interest rates. It took almost 20 years for the interest rate to drop from above 7% to under 3%, and yet in the most recent 18 months, it rose back up. Businesses are suffering all across the industry. Some businesses closed.
What did Rocket choose to do? We've been around the industry for 38 years. We've seen crises and we went through them. We love challenges. We love changes. Every time there's a crisis, it's our time to grow and shine because we have a strong business strategy to focus on the most important part - our clients. What do they need?
We are going through this journey together. We decided to grow our client base and continue to provide more variety of financial services to help our clients. And all of this will not happen without the power of data.
Talking a little bit more about data and analytics at Rocket Companies, I want to share a little story. I started at Rocket Companies 10 years ago as an intern. A couple of weeks in, I was sitting on the floor one day when our founder, Mr. Dan Gilbert, walked in - and there he was, with the legendary Warren Buffett alongside him!
It was a beautiful moment: Dan was showing Warren our full end-to-end self-service business analytics that we call Mission Control, where we give in-depth visibility into where clients are in each and every step of the mortgage process. That moment was so intriguing, it planted a passion in me that made me dedicate my past decade at Rocket to data.
Over the past 15-plus years, we have deployed hundreds and thousands of self-service analytical solutions in every corner of our business. It's so critical. We also want to continue to grow our digital landscape - when a client visits any of our web properties, we want to understand who our client is and what they are looking for, so that we can deliver a really robust, personalized experience.
Also, we spend a lot of money on marketing - how can we optimize that? How can we ensure our hundreds of millions in marketing spend is optimized, especially when the economy is giving us challenges? We leverage the power of data science. We were able to develop things like hedge models to protect our tens of billions in capital markets book.
We are also dedicated to full automation - continually striving to make the mortgage process more automated. And none of that would have happened without the power of big data.
Last but not least, one of the most critical things for us as a financial services company is how much we care about protecting our clients' sensitive data. We have a lot of compliance requirements to meet, such as GLBA - the Gramm-Leach-Bliley Act - covering data retention, encryption, and data classification. All of those require a powerful data platform.
We were early adopters of the data lake. We started our first generation of the data lake in 2017 using an old legacy version of Hadoop. We leveraged Spark for data engineering and data science, and we used Hive for querying, mainly for analysts. We initially had all of our data in HDFS, which was working at the point in time when we deployed our first data lake.
But we quickly learned it could not scale, it was not as reliable, and it was 10 times more expensive than S3. The data lake worked very well and powered dozens of data science initiatives, but year after year - when the mortgage rate dropped drastically in 2019 and 2020 and we saw our volume double and at times even triple - we quickly realized this legacy structure could not scale.
A few factors, a few symptoms: one, data ingestion simply took too long. With this legacy stack, ingesting a new source into our data lake took a matter of weeks, 4 to 8 weeks. That was not acceptable at that point in time. Also, it could not scale, in the sense that the legacy stack was getting harder and harder to support.
Anyone here work in a support group, or understand the challenge of being on call? I see a few hands. We do not want our engineers to wake up at 3am and have to troubleshoot, yet our legacy stack quickly showed a lot of challenges in support.
We must get out of this. We must modernize our data platform. How to do that exactly? As I shared, step one: you will see it when you believe it. We have passion, we need to bring our team members' commitment. How do we do that?
I actually have a video to share with you that we created at the very beginning of our migration journey - I wrote the lyrics, and our team made this migration song. Please enjoy.
Thank you so much. That video meant a lot to me. The guy dancing and playing ukulele, Mr. Shannon Hall, was one of the best engineers who had been with Rocket for 11 years. He just passed away two months ago due to cancer. But it was passion from folks like him and a lot of team members, and our commitment, that started this journey and we finished it now with the modernized data science platform.
What had we accomplished? Data ingestion now takes a matter of minutes. We were able to deploy a fully configuration-driven Lambda pattern that's applicable to dozens of different data sources.
How about support? No one needs to wake up at 3am firefighting anymore. Once we adopted the modernized data platform, I barely recall maybe 1 to 2 instances in the past 18 months. Life is beautiful!
How many folks here are data scientists? Yeah, I'm seeing a few now. One of the key challenges for data scientists is you might be spending 60% of your time looking for data. And with our legacy stack, not only were they waiting for resources to free up, it could not scale.
Now with a modernized data science platform, we can easily scale up and down. And without this investment, we would not have been able to power 3.7 billion automated AI and data science decisions.
Today, we all know we want a platform that's better, faster, and cheaper. How about the cost of our legacy stack? We went with a couple of vendors and were committed to over a million dollars in fixed costs. When we tried to reduce it, it was ridiculous - the vendor was telling us, "Hey, your version is so outdated. Not only do you need to pay hundreds of thousands of dollars, you also need to sign that we are not really providing you any support."
And now, with the partnership from AWS, we were able to adopt the pay-as-you-go model, so there's zero fixed cost. Also, one highly recommended program is called MAP - the Migration Acceleration Program. If any of you are considering using AWS to modernize your data platform, that's the best one to consider - talk with your account team today, because we were able to get $3 million back in credits. Who doesn't like credits?
How did we do that technically? This is our modernized architecture. By having our data in S3 with multiple layers, we were able to make it cheaper and accomplish sustainability. We like Lake Formation - it helped us seamlessly tie all the services together and we are able to manage security to meet our enterprise standard.
One of our favorite services is Glue Catalog, especially at Rocket Companies. We have multiple companies in the family of companies. We need to be able to provide data sharing capabilities and also meet legal compliance requirements. What's great about Glue Catalog is it reduces the amount of data replication we need to do. All you need to do is share the catalog across different accounts. It's very sweet.
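To make that cross-account sharing concrete, here is a minimal sketch of granting another account in the family read access to a Glue database through Lake Formation permissions. The account IDs and database name are placeholders, not Rocket's actual setup.

```python
# Minimal sketch: grant a sibling account read access to a Glue database so it
# can query the shared catalog instead of copying data. IDs are placeholders.
import boto3

lakeformation = boto3.client("lakeformation")

# Let the consumer account discover the database in the producer's catalog.
lakeformation.grant_permissions(
    Principal={"DataLakePrincipalIdentifier": "222222222222"},  # consumer account
    Resource={"Database": {"CatalogId": "111111111111", "Name": "conformed_layer"}},
    Permissions=["DESCRIBE"],
)

# Let it query every table in that database.
lakeformation.grant_permissions(
    Principal={"DataLakePrincipalIdentifier": "222222222222"},
    Resource={
        "Table": {
            "CatalogId": "111111111111",
            "DatabaseName": "conformed_layer",
            "TableWildcard": {},  # all tables in the database
        }
    },
    Permissions=["SELECT", "DESCRIBE"],
)
```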
As I mentioned, we leverage Lambda and we have a fully configuration-driven ingestion pattern. We now have hundreds of Glue ETL jobs as our regular ETL tooling for data processing. We leverage the power of EMR mainly for our data domains and for building quality data products.
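As an illustration of what a configuration-driven ingestion pattern can look like, here is a hypothetical Lambda handler that looks up a per-source config document and starts a parameterized Glue job. The bucket name, config shape, and job arguments are invented for this sketch, not Rocket's actual implementation.

```python
# Hypothetical configuration-driven ingestion Lambda: each source is described
# by a small config file, so onboarding a new source is a config change rather
# than new code. All names below are placeholders.
import json
import boto3

s3 = boto3.client("s3")
glue = boto3.client("glue")

CONFIG_BUCKET = "example-ingestion-config"  # placeholder bucket


def handler(event, context):
    source = event["source_name"]

    # Load the per-source configuration (landing path, target layer, Glue job).
    cfg_obj = s3.get_object(Bucket=CONFIG_BUCKET, Key=f"sources/{source}.json")
    cfg = json.loads(cfg_obj["Body"].read())

    # Kick off the shared, parameterized Glue job that lands the data in the
    # raw layer of the lake.
    run = glue.start_job_run(
        JobName=cfg["glue_job_name"],
        Arguments={
            "--source_path": cfg["source_path"],
            "--target_path": cfg["raw_layer_path"],
            "--file_format": cfg.get("format", "parquet"),
        },
    )
    return {"source": source, "job_run_id": run["JobRunId"]}
```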
We are able to leverage Redshift as our cloud version of data warehousing, taking our decade of hard work from on-prem SQL Server to the cloud, continuing to grow the Redshift data warehouse.
We also love SageMaker - that's where data scientists really enjoy powerful, up-to-date capabilities. Diving in, let me simplify all of that into three simple steps:
"We currently ingest our data from over 150 different sources. A combination of 1st and 3rd party data. What I described the lambda based approach is only one of the options you can work with your solution architect and accounting to figure out what's best for you. But for us, we need to be able to handle data volume in petabytes. We need to handle a variety of different data sources and the nature of it. And the most important thing is it has to out of scale.
By the way, I'm bringing a demo of auto scaling. Today I'm wearing this beautiful piece, an Indian sari, which I believe is the most auto-scaled solution there is. I can wear it right now, and if I were eight months pregnant, I would just need to wrap it differently. Beautiful auto scaling.
Also, once the data gets into the data lake, what we do is process it in different layers to really bring data governance and meaning to the data, making the conformed layer, then build powerful data domains and quality data products. Then we serve the data, allowing different business use cases and initiatives to connect with our quality data products via APIs.
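A minimal Spark sketch of that layering idea - promoting data from the raw layer into a conformed layer with a couple of basic quality rules - might look like the following. The paths, column names, and dedup key are placeholders.

```python
# Illustrative Spark step promoting data from the raw layer to the conformed
# layer. Paths, columns, and the dedup key are placeholders for whatever your
# own governance rules require.
from pyspark.sql import SparkSession, functions as F

spark = SparkSession.builder.appName("raw-to-conformed").getOrCreate()

raw = spark.read.parquet("s3://example-lake/raw/loans/")

conformed = (
    raw.dropDuplicates(["loan_id"])                                 # basic quality rule
       .withColumn("loan_amount", F.col("loan_amount").cast("decimal(18,2)"))
       .withColumn("ingest_date", F.to_date("ingest_ts"))           # standardized field
)

conformed.write.mode("overwrite").partitionBy("ingest_date").parquet(
    "s3://example-lake/conformed/loans/"
)
```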
Why did we choose EMR? When we talk about a modernized data platform, one of the top concerns is how big the lift is. We don't want to redo the work by doing a parallel migration of hundreds of pipelines. One thing that's very sweet about EMR is the backward compatibility: we were able to simply lift and shift our Spark workloads into EMR with near to no work - the only thing we had to do was configure our cluster and we were ready to go. It's also very cost efficient. One thing to highlight is Spot Instances - it's not just a tagline that they can save you up to 90%, they really do save money, especially when you have workloads that are more flexible. Overall we were able to accomplish a 40% reduction in cost by going to EMR.
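Here is a hedged sketch of launching an EMR cluster whose task capacity runs on Spot Instances, which is one way to realize the kind of savings described above. The release label, IAM role names, subnet, and fleet sizes are placeholders.

```python
# Sketch: EMR cluster with on-demand master/core fleets and a Spot task fleet.
# Roles, subnet, release label, and sizes are placeholders.
import boto3

emr = boto3.client("emr")

response = emr.run_job_flow(
    Name="spark-batch-spot",
    ReleaseLabel="emr-6.15.0",
    Applications=[{"Name": "Spark"}],
    ServiceRole="EMR_DefaultRole",          # placeholder role names
    JobFlowRole="EMR_EC2_DefaultRole",
    Instances={
        "Ec2SubnetId": "subnet-0123456789abcdef0",  # placeholder subnet
        "KeepJobFlowAliveWhenNoSteps": True,
        "InstanceFleets": [
            {"InstanceFleetType": "MASTER", "TargetOnDemandCapacity": 1,
             "InstanceTypeConfigs": [{"InstanceType": "m5.xlarge"}]},
            {"InstanceFleetType": "CORE", "TargetOnDemandCapacity": 2,
             "InstanceTypeConfigs": [{"InstanceType": "m5.xlarge"}]},
            # Flexible workloads go to Spot capacity for the cost savings.
            {"InstanceFleetType": "TASK", "TargetSpotCapacity": 4,
             "InstanceTypeConfigs": [{"InstanceType": "m5.xlarge"},
                                     {"InstanceType": "m5a.xlarge"}]},
        ],
    },
)
print(response["JobFlowId"])
```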
Our most favorite feature of EMR is transient EMR clusters. I was joking with my partner here: think of a traditional EMR cluster as me buying a bunch of saris and putting them in my closet. A transient EMR cluster is like using a Rent the Runway style sari rental service - whenever you need it, however you need it. We love transient EMR because it gives us flexibility and saves the effort our infrastructure team would otherwise need to put into configuring EMR clusters. Just look at this lifecycle: we leverage EventBridge and set up rules as triggers. It's like placing an order - hey, there's a big event coming, I need a sari that looks exactly like this, ready to go - and then we leverage Lambda to launch it, and boom, it's there. We love it.
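A minimal sketch of that transient pattern, assuming an EventBridge rule invokes a Lambda like the one below: the cluster runs its steps and terminates itself when they finish. The script location, roles, subnet, and sizes are placeholders.

```python
# Hypothetical EventBridge-triggered Lambda that launches a transient EMR
# cluster: it runs its steps and shuts itself down, so nothing is left idling.
import boto3

emr = boto3.client("emr")


def handler(event, context):
    response = emr.run_job_flow(
        Name="transient-nightly-domain-build",
        ReleaseLabel="emr-6.15.0",
        Applications=[{"Name": "Spark"}],
        ServiceRole="EMR_DefaultRole",              # placeholder roles
        JobFlowRole="EMR_EC2_DefaultRole",
        Instances={
            "Ec2SubnetId": "subnet-0123456789abcdef0",   # placeholder subnet
            "InstanceGroups": [
                {"InstanceRole": "MASTER", "InstanceType": "m5.xlarge", "InstanceCount": 1},
                {"InstanceRole": "CORE", "InstanceType": "m5.xlarge", "InstanceCount": 2},
            ],
            # The key to the transient pattern: terminate when the steps finish.
            "KeepJobFlowAliveWhenNoSteps": False,
        },
        Steps=[{
            "Name": "build-data-domain",
            "ActionOnFailure": "TERMINATE_CLUSTER",
            "HadoopJarStep": {
                "Jar": "command-runner.jar",
                "Args": ["spark-submit", "s3://example-lake/jobs/build_domain.py"],
            },
        }],
    )
    return {"cluster_id": response["JobFlowId"]}
```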
Now, step two: you have powerful, quality data domains and data products to power your analytics and data science initiatives. The last step is to unleash your data science. Our legacy version of the data science platform provided the most painful experience, as we could not scale. But we are very glad we went with SageMaker. A few things: we were able to get a lot of flexibility in deployment, because we need our models to be deployable into a variety of different hosting environments. We also want to empower more roles - business analysts, data analysts - because not everyone has the necessary skills in programming languages and Spark. We really want to pursue no-code and low-code to leverage the strategic mindset of these roles without getting bottlenecked by skills.
Why SageMaker? The number one thing, as I mentioned earlier, is how key security is for a financial services company. We really appreciate SageMaker, because the way we do data science is that we are able to keep all of our sensitive data in the backbone of AWS, which helps us meet enterprise-level security at scale. Just like EMR, SageMaker also provides wonderful backward compatibility, which makes it easy to lift and shift the Spark workflows our data scientists leverage into the cloud. It also integrates effortlessly with the other services we just talked about, which reduced a lot of effort and extra work.
One thing we really like about SageMaker is the single pane of glass, the user experience. As a data scientist, you may have a simple scikit-learn model that doesn't need a lot of horsepower - you just need a virtual machine with some basic CPUs and that will work. On the other hand, you might have a TensorFlow workload that requires the superpower of GPUs. SageMaker Studio provides a very friendly interface to help you manage this variety of workloads and have dedicated resources matching exactly what you need. With that, we're able to save a lot of cost because of how flexible it is and how customized it is to the requirements of your workload. And also, as I mentioned, we are able to empower more roles in innovation.
One thing we do at Rocket is a hack week every quarter. It's our innovation week, where we give an entire week back to our technology team - you can do anything you want that you believe is either for your own growth or for the company. Those weeks are the most innovative times; a lot of business-transforming ideas come from those weeks. The low-code and no-code capabilities empower, fuel, and accelerate those hack week outcomes.
We've talked about EMR and SageMaker. Now what's next? We want to continue to grow and accelerate our data governance, because as we keep growing our data landscape, it gets exponentially more challenging to manage and govern our data. We are exploring Amazon DataZone - I'm sure this week you will hear a lot of great features and exciting announcements. One thing that's very sweet, and I'm so glad we invested in this modernized data science platform: we initially thought of it as an additional benefit, but now we are definitely loving that we are ready to go for Gen AI with a fully modernized data science platform. We were able to be an early adopter and join an early preview of Bedrock very quickly. Months ago, we were able to provide access to 32 LLMs to our own data scientists and analysts. It's amazing - the innovation just got accelerated and took wings.
We want to further unleash enterprise data science, and we want to encourage our business partners: if you have a great idea in this era of Gen AI, you are not short of tools, but you need passion, you need commitment - you will see it when you believe it. A couple of highlights of our Gen AI use cases. One is addressing the challenge of data governance: we need to be able to manage our massive amount of metadata at scale. Think about it - we have 1 million data elements. How do we manage and monitor those data definitions and business definitions? This is one of our Gen AI use cases, and by leveraging the power of Bedrock and Gen AI, we anticipate an 80% reduction in manual work.
Not only data governance - we also want to help accelerate our data engineering. We're leveraging and evaluating CodeWhisperer, and we want to put powerful tools in place to unleash our data engineers and data scientists, to really help take our business - transforming the way we do mortgages and the way we do financial business - to the next level.
Next, I would like to pass it to my wonderful partner, Ravindra, to talk about some exciting upcoming features.
Thank you, Dan. Thank you very much. This was an awesome, awesome presentation of a customer experience. Think about a 4 to 8 week delivery time frame cut down to less than two hours; think about 180-plus production issues cut down to zero. That speaks to the platform's stability at scale. To wrap it up, I just want to make sure you understand the components that help you on your data journey.
So the first thing we did is build out the platform for different personas across your organization. You could have a case where the domain expertise lies with the business analyst or the domain expert. They don't necessarily know how to code on a big data platform or have data science skills, but what they do have is an understanding of their business process. So we built a simple click, drag-and-drop tool we call SageMaker Canvas. It's end to end: you bring your raw data to S3, get going with SageMaker Canvas on that data, and build, train, and deploy models without understanding the nitty-gritty of data science or big data.
Then we have customers who ask for a very simplified IDE for machine learning and analytics. That's where we have built SageMaker Studio, which gives you this complete view from data, to data harvesting, to building out that machine learning model, experimenting with it, and finally deploying it into production with ease. These are all intuitive workflows that help you take this journey of harvesting your data points.
And finally, MLOps. Machine learning is so critical to the business that in the future every business process is going to have a machine learning model embedded in it. So think about today's 10, 20, or 30 models becoming 600, 1,000, 2,000 models tomorrow. How are you going to manage them in production? You require that DevOps kind of discipline in order to manage those models at scale, and that's where we have built an MLOps capability into the platform itself.
So from sourcing the data, cleansing it into that golden copy, and handing it over to the data scientist who explores the data and builds the models - all of these steps can be stitched together into pipelines, where you can build an end-to-end MLOps capability to govern it and make sure it meets the scale requirements of your organization.
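One common way to stitch those steps together is SageMaker Pipelines; the sketch below chains a processing step into a training step. The scripts, role ARN, and instance types are placeholders, and a real pipeline would typically add evaluation, registration, and approval steps.

```python
# Minimal SageMaker Pipelines sketch: a processing step feeding a training
# step. Scripts, role, and instance types are placeholders.
from sagemaker.sklearn.processing import SKLearnProcessor
from sagemaker.sklearn.estimator import SKLearn
from sagemaker.processing import ProcessingOutput
from sagemaker.inputs import TrainingInput
from sagemaker.workflow.steps import ProcessingStep, TrainingStep
from sagemaker.workflow.pipeline import Pipeline

role = "arn:aws:iam::111111111111:role/ExampleSageMakerRole"  # placeholder role

# Step 1: prepare the training data from the conformed layer.
processor = SKLearnProcessor(framework_version="1.2-1", role=role,
                             instance_type="ml.m5.xlarge", instance_count=1)
prep_step = ProcessingStep(
    name="PrepareTrainingData",
    processor=processor,
    code="prepare.py",  # placeholder preprocessing script
    outputs=[ProcessingOutput(output_name="train",
                              source="/opt/ml/processing/train")],
)

# Step 2: train on whatever the processing step produced.
estimator = SKLearn(entry_point="train.py", framework_version="1.2-1",
                    role=role, instance_type="ml.m5.xlarge", instance_count=1)
train_step = TrainingStep(
    name="TrainModel",
    estimator=estimator,
    inputs={"train": TrainingInput(
        s3_data=prep_step.properties.ProcessingOutputConfig
                         .Outputs["train"].S3Output.S3Uri)},
)

pipeline = Pipeline(name="example-mlops-pipeline", steps=[prep_step, train_step])
# pipeline.upsert(role_arn=role); pipeline.start()
```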
So what would your data scientists get out of it? The platform provides some core capabilities. You could have an EMR cluster, or a big data cluster running Spark or Hive, on our infrastructure, and you want to reach it through SageMaker Studio. We have that capability: you can see all the clusters running in your account and connect to them to build machine learning models. Not only that, if you need to create a new cluster, it's a click away - from Studio itself, you can click, specify what kind of cluster you want, provision it, get it ready, and start your machine learning journey. From a data science perspective, a lot of organizations want to put discipline around what kind of cluster you can create, because clusters come with a cost, and that's where we have a capability in the platform itself to help you template the cluster creation process.
So your administrator, who is controlling cost and wants to make sure cluster creation is authorized only for certain individuals, can do so with the help of our platform. You can create CloudFormation or Terraform scripts and hand them over to SageMaker Studio; Studio presents them to the user, who clicks, for example, and the cluster is created accordingly. All these clusters come with fine-grained controls, and there are capabilities that can help you further enhance your data science experience.
For example, local mode. There are a lot of cases where data scientists want to do processing right where their environment or Studio is running, without sending work back to the cluster. With the help of local mode, you can clean that data right there, without going back and forth with your big data cluster to perform the analysis. The platform also brings the capability for cases where you have your own custom scripts, or have optimized a Spark setup for yourself that you want to keep. Bring-your-own-image gives you that capability, where you work with your own datasets and your own custom images to optimize your data science experience.
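A minimal sketch of local mode with the SageMaker Python SDK, assuming a scikit-learn training script: the same estimator definition runs the training container on the notebook or Studio instance itself (Docker required) when the instance type is set to "local", and on managed instances otherwise. The script name, role, and data path are placeholders.

```python
# Sketch of SageMaker local mode: same estimator, local execution for fast
# iteration. Script, role, and data path are placeholders.
from sagemaker.sklearn.estimator import SKLearn

estimator = SKLearn(
    entry_point="train.py",                 # placeholder training script
    framework_version="1.2-1",
    role="arn:aws:iam::111111111111:role/ExampleSageMakerRole",
    instance_type="local",                  # switch to "ml.m5.xlarge" for remote runs
    instance_count=1,
)

# Train against a local folder; swap in an s3:// URI when running remotely.
estimator.fit({"train": "file://./data/train"})
```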
Again, quite a lot of capability has been built in. And finally, you can automate EMR and AWS Glue in the pipeline to create this MLOps experience, to make sure the next iteration goes through very seamlessly and fewer handoffs happen between teams.
Finally, a number of organizations tell us they have separate analytics teams who want to focus primarily on data exploration or data analysis. For them we have built EMR Serverless, which has been around for a while, but a lot of new capabilities have come in to improve your experience with EMR Serverless.
The first is Graviton2 support. This year we have Graviton2, which is up to 40% cheaper than fifth-generation x86 chipsets. So think about the workloads you're running today moving to Graviton2, and the kind of cost savings you can bring to your organization with a simple click to select Graviton2 images.
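Selecting Graviton for EMR Serverless is an application-level setting; a minimal sketch with boto3 might look like this, with the application name and release label as placeholders.

```python
# Sketch: create an EMR Serverless application on Graviton (ARM64) workers.
# Name and release label are placeholders.
import boto3

emr_serverless = boto3.client("emr-serverless")

app = emr_serverless.create_application(
    name="analytics-spark-arm",
    type="SPARK",
    releaseLabel="emr-6.15.0",
    architecture="ARM64",   # Graviton-based workers instead of x86
)
print(app["applicationId"])
```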
Similarly, a number of customers ask us about job-level cost. When you push work onto a serverless environment, one of the challenges customers raise is: how do I calculate the cost of each job? That's where we have built job-level cost analysis, where you can see, for each job, how much CPU and how much memory it consumed, and calculate the cost per job. Again, quite an impressive capability.
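Here is a sketch of how per-job cost can be estimated from the utilization EMR Serverless reports for each job run. The per-unit prices below are placeholders, so check current pricing for your region before relying on the numbers.

```python
# Sketch: estimate per-job cost from the resource utilization reported by
# EMR Serverless for a finished job run. Prices are placeholders.
import boto3

emr_serverless = boto3.client("emr-serverless")

# Placeholder prices (USD) per vCPU-hour / GB-hour, not actual rates.
PRICE_PER_VCPU_HOUR = 0.052624
PRICE_PER_GB_HOUR = 0.0057785


def estimate_job_cost(application_id: str, job_run_id: str) -> float:
    job = emr_serverless.get_job_run(applicationId=application_id,
                                     jobRunId=job_run_id)
    usage = job["jobRun"]["totalResourceUtilization"]
    return (usage["vCPUHour"] * PRICE_PER_VCPU_HOUR
            + usage["memoryGBHour"] * PRICE_PER_GB_HOUR)
```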
Finally, with Studio you can interact with your Glue or Spark workloads running on EMR without worrying about how to scale or what size you need before you submit your data. So again, quite a lot of capability from your analytics team's perspective to take advantage of EMR Serverless for their workloads.
And finally, MLOps. This capability is critical for organizations to scale. When we talk about an eight-week delivery time frame cut down to two hours, it does not happen in a vacuum - it requires very strict operational guidelines implemented into the process to make sure changes happen in a controlled fashion. That's where we built this MLOps capability, which helps you stitch this end-to-end data process together, control it, and productionize at a much faster clip. You can catalog your versions, you can look at what data you started with and what experiments were performed on that data for your model, and all that version traceability gives you the lineage required for your compliance and regulatory commitments. Again, this is something which has helped a number of customers scale their machine learning.
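The versioning and lineage piece is commonly handled with the SageMaker Model Registry; here is a hedged sketch of registering a trained artifact as a new model version in a package group. The group name, image URI, and artifact path are placeholders.

```python
# Sketch: register a trained model artifact as a new version in a Model
# Registry group so lineage and approval status can be audited. All names,
# image URIs, and paths are placeholders.
import boto3

sm = boto3.client("sagemaker")

# One-time: create the group that will hold every version of this model.
sm.create_model_package_group(
    ModelPackageGroupName="example-credit-model",
    ModelPackageGroupDescription="All registered versions of the example model",
)

# Register a trained artifact as the next version in the group.
sm.create_model_package(
    ModelPackageGroupName="example-credit-model",
    ModelApprovalStatus="PendingManualApproval",
    InferenceSpecification={
        "Containers": [{
            "Image": "111111111111.dkr.ecr.us-east-1.amazonaws.com/example-inference:latest",
            "ModelDataUrl": "s3://example-bucket/models/model.tar.gz",
        }],
        "SupportedContentTypes": ["text/csv"],
        "SupportedResponseMIMETypes": ["text/csv"],
    },
)
```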
And finally, we talked about Canvas, which is our no-code, low-code tool. The beauty of this tool is that you can now work independently and democratize machine learning and analytics across the organization - it's not limited to very skilled folks - and it helps you build that first draft model. We have customers, for example, who use Canvas to build their simpler models, but also to create a draft for very complex use cases and hand it over to the data science team for further fine-tuning and deployment. So think about your business case, where your business user creates the model for your data scientists and, with a click, shares with the data science team exactly what they want, instead of drafting a Word doc of requirements and going through multiple phases to understand what the final scope of the product should look like. It's very, very powerful for helping your entire organization embark on its journey of machine learning and analytics, and this is how you achieve that.
So when we talk about an end-to-end data strategy: we said the cost of data is going down and storage is cheap. What that means is that once you start collecting these many, many data points - like Dan mentioned, Rocket Companies has 150-plus data sources - and have them stored in these purpose-built databases, you can create this analytics layer seamlessly across the organization with the help of Redshift, EMR, and Studio, and transform the organization with machine learning. We have hundreds and thousands of customers taking advantage of these best practices, and we definitely want you to take those best practices to your organizations as well. We're happy to connect and answer any questions you might have.
And finally, today you have completed the first session of the analytics superhero series. If you scan this QR code, it's going to show you what other options you have to guide your skill development over the next five days at re:Invent. Thank you, and we hope you enjoy re:Invent. It's a great time to learn, to be curious, to share your experience with your peers, and to connect with our partners. If there's anything we can help with from a Q&A perspective, I believe we still have 19 minutes, so we can answer any specific questions you might have.