🍊 From Oranges to Algorithms

Will AI help me find my dream apartment in Valencia?

Introduction

  1. Overview of the Project

    • Objective: To leverage modern technologies in the process of buying a flat in Spain.
    • Scope: Utilizing machine learning, cloud computing, and DevOps practices to streamline the property search, evaluation, and purchase processes.
  2. Relevance of Technologies

    • Explanation of why incorporating these technologies is beneficial.
    • Current trends and advancements in real estate and tech integration.

Step 1: Defining the Requirements

  1. Identify User Needs

    • Location preferences (neighborhoods, proximity to my places of interest).
    • Budget constraints.
    • Specific requirements (size of the flat, amenities, terrace, elevator, etc.).
    • The user (me) needs to not panic about buying an apartment, which is why she is focusing on this project instead :P So far so good!
  2. Data Sources

    • Real estate websites and databases (Idealista.com, Fotocasa, etc.).
    • Public data (crime rates, school quality, environmental factors).
    • Market trends and price analytics.
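The requirements above can be written down as data, so that every listing is checked against the same rules. What follows is a minimal sketch, not the project's actual code: the `Requirements` class, the `matches` helper, the listing field names, and all the numbers are assumptions made up for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Requirements:
    """Hard constraints for the search (hypothetical structure)."""
    max_price_eur: int
    min_size_m2: int
    neighborhoods: frozenset
    needs_elevator: bool = False
    needs_terrace: bool = False


def matches(listing: dict, req: Requirements) -> bool:
    """Return True if a listing dict satisfies every hard requirement."""
    return (
        listing["price_eur"] <= req.max_price_eur
        and listing["size_m2"] >= req.min_size_m2
        and listing["neighborhood"] in req.neighborhoods
        # A missing amenity key is treated as "not available".
        and (listing.get("elevator", False) or not req.needs_elevator)
        and (listing.get("terrace", False) or not req.needs_terrace)
    )


# Placeholder values for illustration only, not real search criteria.
req = Requirements(
    max_price_eur=250_000,
    min_size_m2=70,
    neighborhoods=frozenset({"Ruzafa", "Benimaclet"}),
    needs_elevator=True,
)
listings = [
    {"id": "a1", "price_eur": 230_000, "size_m2": 80,
     "neighborhood": "Ruzafa", "elevator": True},
    {"id": "b2", "price_eur": 210_000, "size_m2": 75,
     "neighborhood": "Benimaclet"},
    {"id": "c3", "price_eur": 300_000, "size_m2": 95,
     "neighborhood": "Ruzafa", "elevator": True},
]
shortlist = [item["id"] for item in listings if matches(item, req)]
```

Keeping the constraints in one frozen object means the scraper, the notifier, and the website filters could all share the same definition of "a flat worth looking at".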

Step 2: Data Collection and Preprocessing

  1. Web Scraping and APIs

    • Tools: Scrapy, BeautifulSoup, Selenium.
    • Real estate APIs: Integrating with platforms that provide property listings. I have been using the Idealista API for the past year.
    • The scraper runs on a Raspberry Pi, scheduled with cron.
    • For each listing, the script checks whether the apartment is already in the database and, if it is, whether the price has changed.
  2. Data Cleaning

    • Handling missing values.
    • Normalizing data (consistent formats for prices, addresses, etc.).
  3. Data Storage

    • Cloud database: MongoDB
    • NoSQL was chosen for its flexible schema, which handles the varied and complex data structures typical of real estate listings with numerous attributes.
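The price normalization and the "already in the database, has the price changed?" check described in this step can be sketched as follows. This is a hedged illustration only: `parse_price`, `upsert_listing`, and the document fields are hypothetical names, and a plain dict stands in for the MongoDB collection so the example runs without a database.

```python
import re
from datetime import datetime, timezone


def parse_price(raw: str) -> int:
    """Turn strings like '249.000 €' or '€249,000' into integer euros.

    Assumption: listing prices are whole euros, so every non-digit
    character (currency sign, thousands separator) can be dropped.
    """
    digits = re.sub(r"[^\d]", "", raw)
    if not digits:
        raise ValueError(f"no price found in {raw!r}")
    return int(digits)


def upsert_listing(store: dict, listing_id: str, raw_price: str, now=None) -> str:
    """Insert a new listing or record a price change.

    Returns 'new', 'price_changed', or 'unchanged'.
    """
    now = now or datetime.now(timezone.utc)
    price = parse_price(raw_price)
    doc = store.get(listing_id)
    if doc is None:
        store[listing_id] = {
            "price_eur": price,
            "first_seen": now,
            "price_history": [(now, price)],
        }
        return "new"
    if doc["price_eur"] != price:
        doc["price_eur"] = price
        doc["price_history"].append((now, price))
        return "price_changed"
    return "unchanged"


store = {}  # stand-in for the real collection
first = upsert_listing(store, "a1", "249.000 €")
second = upsert_listing(store, "a1", "€249,000")
third = upsert_listing(store, "a1", "239.000 €")
```

Keeping a `price_history` list per listing is one possible design; it makes price drops easy to report later without a separate table.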

Step 3: Machine Learning Model Development

  1. Model Selection

    • Types: Supervised learning for price prediction, unsupervised learning for clustering similar properties.
    • Algorithms: Linear Regression, Random Forest, K-Means Clustering.
  2. Model Training

    • Dataset: Historical property prices, features (size, location, amenities).
    • Frameworks: TensorFlow, PyTorch, Scikit-learn.
  3. Model Evaluation

    • Metrics: Mean Absolute Error (MAE), R-squared.
    • Cross-validation techniques.
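To make the evaluation metrics concrete, here is a small standard-library sketch of MAE, R-squared, and k-fold cross-validation around a one-feature linear regression (price as a function of size). It is an illustration under stated assumptions: the data is invented and all helper names are hypothetical. In practice the same steps would more likely go through Scikit-learn, for example `mean_absolute_error`, `r2_score`, and `KFold`.

```python
from statistics import mean


def fit_line(xs, ys):
    """Ordinary least squares for one feature: y = a * x + b."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    a = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    return a, my - a * mx


def mae(y_true, y_pred):
    """Mean Absolute Error: average size of the prediction error."""
    return mean(abs(t - p) for t, p in zip(y_true, y_pred))


def r_squared(y_true, y_pred):
    """Share of the variance in y_true explained by the predictions."""
    my = mean(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - my) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot


def k_fold_mae(xs, ys, k=3):
    """Average held-out MAE; fold i holds out samples i, i+k, i+2k, ..."""
    scores = []
    for i in range(k):
        held_out = set(range(i, len(xs), k))
        train = [(x, y) for j, (x, y) in enumerate(zip(xs, ys)) if j not in held_out]
        test = [(x, y) for j, (x, y) in enumerate(zip(xs, ys)) if j in held_out]
        a, b = fit_line([x for x, _ in train], [y for _, y in train])
        scores.append(mae([y for _, y in test], [a * x + b for x, _ in test]))
    return mean(scores)


# Invented, perfectly linear toy data: price = 2000 * size + 10000.
sizes = [50, 60, 70, 80, 90, 100]
prices = [2000 * s + 10000 for s in sizes]

a, b = fit_line(sizes, prices)
predictions = [a * s + b for s in sizes]
in_sample_mae = mae(prices, predictions)
in_sample_r2 = r_squared(prices, predictions)
cv_mae = k_fold_mae(sizes, prices, k=3)
```

On real listings the cross-validated MAE is the number to watch, since it measures error on flats the model has not seen.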

Step 4: Cloud Infrastructure and Deployment

See the cloud setup for this project here

  1. Cloud Providers

    • AWS
  2. CI/CD Pipeline

    • Tools: Jenkins, CircleCI, GitHub Actions.
    • Steps: Code integration, automated testing, continuous deployment.
  3. Containerization and Orchestration

    • Docker: Containerizing the ML models and applications.
    • Ansible: Configuration management, automation, and server orchestration.

Step 5: DevOps Practices

  1. Infrastructure as Code (IaC)

    • Tools: Terraform, AWS CloudFormation, Azure Resource Manager.
    • Automation of infrastructure setup and management.
  2. Monitoring and Logging

    • Tools: Prometheus, Grafana, ELK Stack (Elasticsearch, Logstash, Kibana).
    • Setting up alerts and dashboards for real-time monitoring.
  3. Security

    • Practices: Encryption, Identity and Access Management (IAM), Secure APIs.
    • Tools: AWS IAM, Azure Security Center, Google Cloud Identity.

Step 6: Showcasing the Results

  • Stack: React, JavaScript, Docusaurus
  • Features: Property search filters, interactive maps, price predictions.
  • Website hosting: Netlify

Step 7: Testing and Feedback

  1. User Testing

    • Beta testing with a group of users.
    • Collecting feedback for improvements.
  2. Iterative Improvement

    • Implementing changes based on user feedback.
    • Continuous improvement cycle.

Step 8: Final Deployment and Maintenance

  1. Deployment Strategy

    • Phased rollout, blue-green deployment.
  2. Post-Deployment Monitoring

    • Continuous monitoring of application performance and user feedback.
  3. Maintenance Plan

    • Regular updates, bug fixes, and feature enhancements.

Quick summary of the stack used in this project

  1. Machine Learning Models

    • Proficiency in ML frameworks and algorithms (TensorFlow, PyTorch, Scikit-learn).
    • Good practices for ML project setup and delivery.
  2. Cloud Computing

    • AWS
  3. DevOps Skills

    • CI/CD, containerization (Docker), orchestration (Kubernetes).
  4. Data Engineering

    • Data pipelines, ETL processes, Big Data technologies.
  5. Programming Languages

  6. Infrastructure as Code (IaC)

    • Terraform, Ansible
  7. Monitoring and Logging

    • Prometheus, Grafana, internal logs in Python, Amazon CloudWatch
  8. Security Best Practices

    • IAM, encryption, secure coding practices.
  9. Front-end development

    • Interactive website hosted on Netlify, built with React
  10. User notifications

    • Telegram bot scanning Idealista for personal recommendations
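As a rough idea of how the notification step could look, the sketch below formats a matching listing and builds a request for the Telegram Bot API `sendMessage` method. It is not the project's actual bot: the helper names, the listing fields, the token, and the chat ID are placeholders, and the request is only built here, not sent.

```python
import json
from urllib import request


def format_alert(listing: dict) -> str:
    """Render one listing as a short plain-text message."""
    return (
        f"New match: {listing['title']}\n"
        f"{listing['price_eur']:,} EUR, {listing['size_m2']} m2\n"
        f"{listing['url']}"
    )


def build_send_message_request(token: str, chat_id: str, text: str) -> request.Request:
    """Build (but do not send) a Bot API sendMessage request."""
    payload = json.dumps({"chat_id": chat_id, "text": text}).encode("utf-8")
    return request.Request(
        f"https://api.telegram.org/bot{token}/sendMessage",
        data=payload,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Placeholder listing and credentials, for illustration only.
listing = {
    "title": "Flat with terrace",
    "price_eur": 230_000,
    "size_m2": 80,
    "url": "https://example.com/listing/a1",
}
text = format_alert(listing)
req = build_send_message_request("123:ABC", "42", text)
```

Passing `req` to `urllib.request.urlopen` would actually deliver the message, given a real bot token and chat ID.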

This step-by-step outline is a guide to running a tech-driven project for buying a flat in Spain, and it also highlights the skills an MLOps engineer needs in today's market.

{/* from tiktok:

  1. SQL Storage (e.g. RDS, MySQL, Oracle, TiDB/TiKV)
  2. NoSQL (e.g. Redis, Memcached, Mongo, LevelDB, RocksDB)
  3. Big Data Frameworks (e.g. HDFS, Hadoop, Yarn, Flink, Kafka, Spark, Storm, K8s)
  4. Distributed Coordination Service (e.g. ETCD, Zookeeper)
  5. Application Performance Management (e.g. Grafana, ClickHouse, Hive, Falcon, Zabbix, Prometheus)
  6. Cloud Native Tech (e.g. Kubernetes/K8S, Docker)
  • Experience in resource management and task scheduling for large scale distributed systems.
  • Proficiency in at least one machine learning framework: Hardware-Software Co-Design, High Performance Computing, ML Hardware Acceleration (e.g. GPU/TPU/RDMA) or ML Framework (e.g. TensorFlow/PyTorch) */}