Data Science on the Google Cloud Platform. 2nd Edition
- Author:
- Valliappa Lakshmanan
- Pages:
- 462
- Available formats:
- ePub, Mobi
Ebook description: Data Science on the Google Cloud Platform. 2nd Edition
Learn how easy it is to apply sophisticated statistical and machine learning methods to real-world problems when you build using Google Cloud Platform (GCP). This hands-on guide shows data engineers and data scientists how to implement an end-to-end data pipeline with cloud native tools on GCP.
Throughout this updated second edition, you'll work through a sample business decision by employing a variety of data science approaches. Follow along by building a data pipeline in your own project on GCP, and discover how to solve data science problems in a transformative and more collaborative way.
You'll learn how to:
- Employ best practices in building highly scalable data and ML pipelines on Google Cloud
- Automate and schedule data ingest using Cloud Run
- Create and populate a dashboard in Data Studio
- Build a real-time analytics pipeline using Pub/Sub, Dataflow, and BigQuery
- Conduct interactive data exploration with BigQuery
- Create a Bayesian model with Spark on Cloud Dataproc
- Forecast time series and do anomaly detection with BigQuery ML
- Aggregate within time windows with Dataflow
- Train explainable machine learning models with Vertex AI
- Operationalize ML with Vertex AI Pipelines
You can read the ebook "Data Science on the Google Cloud Platform. 2nd Edition" on:
- Inkbook, Kindle, Pocketbook, Onyx Boox, and other e-readers
- Windows, macOS, and other systems
- Windows, Android, iOS, and HarmonyOS systems
- any device or application that supports the PDF, EPub, and Mobi formats
Ebook details
- Ebook ISBN:
- 978-10-981-1891-4, 9781098118914
- Ebook release date:
- 2022-03-29 The ebook release date is often the day the title went on sale and may differ from the release date of the print edition. Additional information may be available in the free sample. If in doubt, contact sklep@ebookpoint.pl.
- Publication language:
- English
- ePub file size:
- 11.4MB
- Mobi file size:
- 21.0MB
Ebook table of contents
- Preface
- Who This Book Is For
- Conventions Used in This Book
- Using Code Examples
- O'Reilly Online Learning
- How to Contact Us
- Acknowledgments
- 1. Making Better Decisions Based on Data
- Many Similar Decisions
- The Role of Data Scientists
- Scrappy Environment
- Full Stack Cloud Data Scientists
- Collaboration
- Best Practices
- Simple to Complex Solutions
- Cloud Computing
- Serverless
- A Probabilistic Decision
- Probabilistic Approach
- Probability Density Function
- Cumulative Distribution Function
- Choices Made
- Choosing Cloud
- Not a Reference Book
- Getting Started with the Code
- Agile Architecture for Data Science on Google Cloud
- What Is Agile Architecture?
- No-Code, Low-Code
- Use Managed Services
- Summary
- Suggested Resources
- 2. Ingesting Data into the Cloud
- Airline On-Time Performance Data
- Knowability
- Causality
- Training-Serving Skew
- Downloading Data
- Hub-and-Spoke Architecture
- Dataset Fields
- Separation of Compute and Storage
- Scaling Up
- Scaling Out with Sharded Data
- Scaling Out with Data-in-Place
- Ingesting Data
- Reverse Engineering a Web Form
- Dataset Download
- Exploration and Cleanup
- Uploading Data to Google Cloud Storage
- Loading Data into Google BigQuery
- Advantages of a Serverless Columnar Database
- Staging on Cloud Storage
- Access Control
- Ingesting CSV Files
- Partitioning
- Scheduling Monthly Downloads
- Ingesting in Python
- Cloud Run
- Securing Cloud Run
- Deploying and Invoking Cloud Run
- Scheduling Cloud Run
- Summary
- Code Break
- Suggested Resources
- 3. Creating Compelling Dashboards
- Explain Your Model with Dashboards
- Why Build a Dashboard First?
- Accuracy, Honesty, and Good Design
- Loading Data into Cloud SQL
- Create a Google Cloud SQL Instance
- Create Table of Data
- Interacting with the Database
- Querying Using BigQuery
- Schema Exploration
- Using Preview
- Using Table Explorer
- Creating BigQuery View
- Building Our First Model
- Contingency Table
- Threshold Optimization
- Building a Dashboard
- Getting Started with Data Studio
- Creating Charts
- Adding End-User Controls
- Showing Proportions with a Pie Chart
- Explaining a Contingency Table
- Modern Business Intelligence
- Digitization
- Natural Language Queries
- Connected Sheets
- Summary
- Suggested Resources
- 4. Streaming Data: Publication and Ingest with Pub/Sub and Dataflow
- Designing the Event Feed
- Transformations Needed
- Architecture
- Getting Airport Information
- Sharing Data
- Sharing a Cloud Storage dataset
- Sharing a BigQuery dataset
- Dataplex and Analytics Hub
- Time Correction
- Apache Beam/Cloud Dataflow
- Parsing Airports Data
- Adding Time Zone Information
- Converting Times to UTC
- Correcting Dates
- Creating Events
- Reading and Writing to the Cloud
- Running the Pipeline in the Cloud
- Publishing an Event Stream to Cloud Pub/Sub
- Speed-Up Factor
- Get Records to Publish
- How Many Topics?
- Iterating Through Records
- Building a Batch of Events
- Publishing a Batch of Events
- Real-Time Stream Processing
- Streaming in Dataflow
- Windowing a Pipeline
- Streaming Aggregation
- Using Event Timestamps
- Executing the Stream Processing
- Analyzing Streaming Data in BigQuery
- Real-Time Dashboard
- Summary
- Suggested Resources
- 5. Interactive Data Exploration with Vertex AI Workbench
- Exploratory Data Analysis
- Exploration with SQL
- Reading a Query Explanation
- Exploratory Data Analysis in Vertex AI Workbench
- Jupyter Notebooks
- Creating a Notebook
- Jupyter Commands
- Installing Packages
- Jupyter Magic for Google Cloud
- Exploring Arrival Delays
- Basic Statistics
- Plotting Distributions
- Quality Control
- Oddball values
- Outlier removal: Big data is different
- Filtering data on occurrence frequency
- Arrival Delay Conditioned on Departure Delay
- Distribution of arrival delays
- Applying a probabilistic decision threshold
- Empirical probability distribution function
- The answer is...
- Evaluating the Model
- Random Shuffling
- Splitting by Date
- Training and Testing
- Summary
- Suggested Resources
- 6. Bayesian Classifier with Apache Spark on Cloud Dataproc
- MapReduce and the Hadoop Ecosystem
- How MapReduce Works
- Apache Hadoop
- Google Cloud Dataproc
- Need for Higher-Level Tools
- Jobs, Not Clusters
- Preinstalling Software
- Quantization Using Spark SQL
- JupyterLab on Cloud Dataproc
- Independence Check Using BigQuery
- Spark SQL in JupyterLab
- Histogram Equalization
- Bayesian Classification
- Bayes in Each Bin
- Evaluating the Model
- Dynamically Resizing Clusters
- Comparing to Single Threshold Model
- Orchestration
- Submitting a Spark Job
- Workflow Template
- Cloud Composer
- Autoscaling
- Serverless Spark
- Summary
- Suggested Resources
- 7. Logistic Regression Using Spark ML
- Logistic Regression
- How Logistic Regression Works
- Spark ML Library
- Getting Started with Spark Machine Learning
- Spark Logistic Regression
- Creating a Training Dataset
- Dealing with corner cases
- Creating training examples
- Training the Model
- Predicting Using the Model
- Evaluating a Model
- Feature Engineering
- Experimental Framework
- Choosing a metric
- Creating the held-out dataset
- Feature Selection
- Creating a large cluster
- Increasing quota
- Autoscale up and down
- Removing features
- Feature Transformations
- Scaling
- Clipping
- Feature Creation
- Categorical Variables
- Repeatable, Real Time
- Summary
- Suggested Resources
- 8. Machine Learning with BigQuery ML
- Logistic Regression
- Presplit Data
- Interrogating the Model
- Evaluating the Model
- Scale and Simplicity
- Nonlinear Machine Learning
- XGBoost
- Hyperparameter Tuning
- Vertex AI AutoML Tables
- Time Window Features
- Taxi-Out Time
- Compounding Delays
- Causality
- Time Features
- Departure Hour
- Transform Clause
- Categorical Variable
- Feature Cross
- Summary
- Suggested Resources
- 9. Machine Learning with TensorFlow in Vertex AI
- Toward More Complex Models
- Preparing BigQuery Data for TensorFlow
- Reading Data into TensorFlow
- Training and Evaluation in Keras
- Model Function
- Features
- Inputs
- Training the Keras Model
- Saving and Exporting
- Deep Neural Network
- Wide-and-Deep Model in Keras
- Representing Air Traffic Corridors
- Bucketing
- Feature Crossing
- Wide-and-Deep Classifier
- Deploying a Trained TensorFlow Model to Vertex AI
- Concepts
- Uploading Model
- Creating Endpoint
- Deploying Model to Endpoint
- Invoking the Deployed Model
- Summary
- Suggested Resources
- 10. Getting Ready for MLOps with Vertex AI
- Developing and Deploying Using Python
- Writing model.py
- Writing the Training Pipeline
- Predefined Split
- AutoML
- Hyperparameter Tuning
- Parameterize Model
- Shorten Training Run
- Metrics During Training
- Hyperparameter Tuning Pipeline
- Best Trial to Completion
- Explaining the Model
- Configuring Explanations Metadata
- Creating and Deploying Model
- Obtaining Explanations
- Summary
- Suggested Resources
- 11. Time-Windowed Features for Real-Time Machine Learning
- Time Averages
- Apache Beam and Cloud Dataflow
- Why Apache Beam?
- Why Dataflow?
- Starting points
- Reading and Writing
- Reading from BigQuery
- Local JSON input
- Filtering
- Time Windowing
- Assigning a timestamp
- Sliding windows
- Computing moving average
- Removing duplicates
- Machine Learning Training
- Machine Learning Dataset
- Label
- Data split
- Distance bug
- Monitoring and verification
- Training the Model
- Changes from Chapter 10
- AutoML model
- Custom model
- Streaming Predictions
- Reuse Transforms
- Input and Output
- Invoking Model
- Reusing Endpoint
- Shared handle
- Per-worker instance
- Batching Predictions
- Streaming Pipeline
- Writing to BigQuery
- Executing Streaming Pipeline
- Late and Out-of-Order Records
- Uniformly distributed delay
- Exponential distribution
- Normal distribution
- Watermarks and triggers
- Possible Streaming Sinks
- Choosing a sink
- Cloud Bigtable
- Designing tables
- Designing the row key
- Streaming into Cloud Bigtable
- Querying from Cloud Bigtable
- Summary
- Suggested Resources
- 12. The Full Dataset
- Four Years of Data
- Creating Dataset
- Dataset split
- Shuffling data
- Need for continuous training
- More powerful machines
- Training Model
- Evaluation
- RMSE
- Confusion matrix
- Impact of threshold
- Impact of a feature
- Analyzing errors
- Categorical features
- Summary
- Suggested Resources
- Conclusion
- A. Considerations for Sensitive Data Within Machine Learning Datasets
- Handling Sensitive Information
- Sensitive Data in Columns
- Sensitive Data in Natural Language Datasets
- Sensitive Data in Free-Form Unstructured Data
- Sensitive Data in a Combination of Fields
- Sensitive Data in Unstructured Content
- Protecting Sensitive Data
- Removing Sensitive Data
- Masking Sensitive Data
- Coarsening Sensitive Data
- Establishing a Governance Policy
- Index