Jeff Hsu

Software Engineer

Passionate about program development and optimization.
Enjoys solving problems and thrives on challenges.
Keen on researching and sharing technologies.
Experienced in team collaboration and leadership.


Skills

Programming
  • Python
  • C#
  • Java
  • SQL
Cloud service
  • AWS Lambda
  • AWS SQS
  • AWS SageMaker
  • GCP BigQuery
  • GCP Cloud Run
  • GKE
  • Azure Functions
  • Azure App Service
Data
  • Kafka
  • Flink
  • Elasticsearch
  • Redis
  • Looker Studio
Database
  • MySQL
  • MS SQL Server
  • PostgreSQL
  • Milvus
DevOps
  • Kubernetes
  • Nginx
  • GitLab CI
  • AWS CDK
  • AWS SAM

Experience

Software Engineer

Viberse Technology Pte. Ltd.
  • Reverse Media Search System
  • Built a reverse media search system using the Milvus vector database, the sentence-transformer machine learning library, and the CLIP model. Implemented search for similar images and videos by text, image, or video.

  • Backend Development and ML Model Deployment
  • Developed and deployed machine learning models on cloud platforms, implemented scalable backend services with a microservices architecture, integrated Redis caching, and configured Nginx for optimized request routing and rate limiting to ensure system stability and low-latency APIs.

  • MySQL CDC Sync to Elasticsearch using Flink DataStream
  • Developed a CDC connector and Elasticsearch sink using the Java Flink DataStream API. The connector reads the MySQL binlog to synchronize entire database tables, including newly created tables, to their corresponding Elasticsearch indexes in real time.

  • Application Services CI/CD
  • Created CI/CD pipelines using AWS CDK, AWS SAM, and Secrets Manager to deploy API Gateway, SQS, and Lambda functions and layers across multiple regions and environments.

April 2024 - September 2024

Software Engineer

ONElab Technology Ltd.
  • Pipeline Service Development and Integration
  • Developed and maintained robust data pipeline services utilizing Python and .NET C#, integrating data from cloud SQL, on-premises databases, Kafka, and various departmental projects for comprehensive data aggregation and analysis.

  • Data Warehouse Restructuring and Cost Optimization
  • Led a technical project to restructure the data warehouse, reducing departmental BigQuery costs by 30%, increasing scheduling efficiency and flexibility, and simplifying the development process.

  • Event-Driven Architecture Design and Implementation
  • Designed and implemented an event-driven architecture integrating streaming data, AI models, and the Telegram API to analyze user chat messages in real time, automatically block ads, and send alert notifications.

  • DevOps Proficiency in Software Deployment and Pipeline Management
  • Implemented and deployed software and services through Kubernetes, cloud functions, and GitLab CI to streamline pipeline construction and enhance operational efficiency.

  • Data-Driven User Behavior Analysis and Metrics Computation
  • Collected real-time data and developed data models to compute user footprints and various behavioral metrics. Continuously performed multi-dimensional user segmentation and labeling with a CDP.

  • Skype Bot Development
  • Developed a Skype bot for automated message broadcasting to targeted users based on various indicators.

  • Monitoring and Alert System Establishment
  • Established a comprehensive monitoring and alert system using Slack Webhook, Cloud Logging, Scheduled Queries, and Looker Studio.

  • Data Modeling-Driven Business Intelligence
  • Formulated computational logic for various product and operational metrics, and developed Looker Studio dashboards to facilitate data analysis for various departments and project teams, enabling informed decision-making and project monitoring.

October 2022 - March 2024

Data Engineer Intern

Foresii Co., Ltd.
  • Spoke at a seminar on corporate sustainable development and innovative technology practices

  • Developed backend APIs using Flask and deployed services with Nginx, Gunicorn, and AWS EC2

  • Developed and deployed serverless web crawlers and controller endpoints using Azure Durable Functions and App Service

  • Collected data and conducted data analysis based on project requirements

March 2022 - August 2022

Projects

ERP System

An intuitive, user-friendly, and intelligent HR management and potential order tracking system that reduces organizational implementation costs. Combined with multi-dimensional data analysis, it efficiently uncovers gaps and opportunities.

  • Backend Developer | Architect


Segmentation management.
Real-time data visualization.
Authority hierarchy.
Intelligent recommendation system.



Key technologies
  • Cross-departmental communication, planning, and deployment of front-end and back-end server and database architecture.
    Tools : AWS, MariaDB

  • Backend functionality development, building recommendation systems, asynchronous triggering of spiders.
    Tools : Flask, SQL, RESTful API, subprocess

  • Setting up a WSGI server as a Linux daemon.
    Tools : Gunicorn, gevent, coroutine, systemd

  • Configure domain forwarding and a reverse proxy to avoid CORS errors, and pass the client's IP address to the backend server for logging.
    Tools : Nginx, X-Forwarded-For, realip

Customized Spiders and Job Vacancy Analysis

Automatically collects job vacancy information from major job-seeking platforms for data analysis and horizontal comparison. Activates customized spiders and multi-faceted database filtering based on input keywords.

  • Project Leader | Data Engineer





Select the target website.
Enter keywords based on needs.
Routine monitoring of processes.


Key technologies
  • Develop spiders and design database architecture.
    Tools : Python Selenium, Beautiful Soup, Azure Database for MySQL flexible server

  • Deploy async HTTP APIs that redirect clients to status-polling endpoints, resolving response timeout issues and improving the user experience.
    Tools : Azure Durable Functions

  • Develop multi-faceted query and filtering features to make the data easier to use.
    Tools : Flask, SQL, Swagger UI

  • Schedule triggers for spiders to automatically update resources weekly.
    Tools : Azure Functions Timer trigger

  • Set up cloud servers.
    Tools : Azure App Service

  • Visualize and analyze data.
    Tools : Azure Active Directory, Power BI

Minecraft Competition Cybersecurity Analysis

Analyzes Minecraft server logs and Linux firewall data to cross-reference and pinpoint the sources of network attacks.

  • Data Engineer

Minecraft Server Log Amount, Level, Type
Authenticator, Warning, Watchdog, and Crash
Linux ufw block, Src IP, and User ID
IP, Window, TTL, TOS
Cross Analysis



Custom Data Pipeline.
Analyze system security logs.
Cross-analysis with dashboards.


Key technologies
  • Monitor system paths and read server logs.
    Tools : Filebeat

  • Parse logs and extract their time information as event timestamps.
    Tools : Logstash

  • Define mappings and store indexes.
    Tools : Elasticsearch

  • Perform data visualization and create cross-analysis dashboards.
    Tools : Kibana

Linebot Log System

Detects the current server's host IP and network status, and automatically broadcasts updates through the official LINE chat room.

  • Application Developer



Key technologies
  • Detect current IP and network conditions.
    Tools : subprocess, ifaddresses

  • Integrate with a LINE official account and broadcast messages.
    Tools : LineBot API