Amazon Redshift Spectrum vs Athena
8/17/2023

Amazon Athena and Redshift Spectrum are both AWS services that can run queries on Amazon S3 data. Both are part of the AWS environment, so it is quite natural to be a bit confused about which one you should use. Before we go into details, here is a quick rundown of both of them.

AWS Athena

AWS Athena is a fast, serverless, interactive query service. It is simple to use, since you can analyze data using standard SQL. You do not need to load data into Athena; you can use Athena to query your data where it already sits on S3.

I can't believe Spectrum is deprecated, so I must offer this answer to contest that claim. Why would Amazon offer a serverless product in Athena that outperforms Redshift Spectrum, which is more expensive? This is how they are choosing to deprecate RRS. This, and the lack of decent literature on Spectrum, made me suspect RA3 might be edging Spectrum out.

Why would you use your own estate to perform the queries that Athena would do without such an investment from you? Caching, where it fits. And consistent performance, if I am to believe Adrian Cantrill more than Jon Scott. So I say Spectrum is most suited to cases where we have long-term Redshift clusters that, being OLAP nodes, have spare capacity to query S3. The recent RA3 instances seem to overlap this niche, though. I (again, based solely on my hands-off research) would choose Spectrum when the majority of my data is in S3, which would typically be the case for larger data sets.

I think the picture below (from ) makes it fairly clear that compute nodes are influential here, which is perhaps contrary to the valuable insights above.

The rest of that answer is good, and I do not mean to directly copy any of it here (without references, it hadn't registered with me when I wrote this). I wrote this answer because I wasn't satisfied with the leading answer's treatment of Athena outperforming Redshift Spectrum.
I appreciate this information might only be useful for the exam, but I didn't find his argument convincing. I had learned (from Adrian Cantrill's/LA's 2019 SA Pro course) that Redshift Spectrum uses one's own Redshift cluster to provide more consistent performance than is available from the shared capacity that AWS makes available to Athena queries. But it still has a long way to go before it is mature.
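To make the point about querying S3 data in place with standard SQL a little more concrete, here is a minimal Python sketch of how an Athena query might be submitted with boto3. The table name (access_logs), database name (weblogs), and results bucket (example-bucket) are hypothetical placeholders, not anything from the discussion above. The request builder is plain Python; only run_query touches AWS, and it assumes boto3 is installed and credentials are configured.

```python
def build_athena_request(sql, database, output_location):
    """Build keyword arguments for Athena's StartQueryExecution call."""
    # Athena writes query results to an S3 location, so insist on an S3 URI.
    if not output_location.startswith("s3://"):
        raise ValueError("expected an s3:// URI for the results location")
    return {
        "QueryString": sql,
        "QueryExecutionContext": {"Database": database},
        "ResultConfiguration": {"OutputLocation": output_location},
    }


def run_query(request):
    """Submit the query; requires boto3 and valid AWS credentials."""
    import boto3  # imported lazily so the builder works without boto3

    client = boto3.client("athena")
    response = client.start_query_execution(**request)
    return response["QueryExecutionId"]


# Hypothetical names throughout: the data stays in S3, only the SQL is sent.
request = build_athena_request(
    sql="SELECT status, COUNT(*) AS hits FROM access_logs GROUP BY status",
    database="weblogs",
    output_location="s3://example-bucket/athena-results/",
)
```

Nothing is loaded into a cluster here: the request names a database in the catalog and an S3 location for the results. With Redshift Spectrum, by contrast, a comparable query would run through your own Redshift cluster against an external table.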