Posts

Mar 26, 2024
GMV Forecasting via xDeepFM

: Machine Learning, Deep Learning

In this post, I aim to share how I conducted a proof of concept (PoC) to solve a real-world problem using deep learning techniques, emphasizing a clean code structure.
Nov 16, 2023
Decoding LoRA: A Comprehensive Summary on Low-Rank Adaptation

: Machine Learning, Deep Learning, NLP, LLM

Recently, I came across an intriguing article on low-rank techniques employed in Large Language Models (LLMs), specifically LoRA: Low-Rank Adaptation of Large Language Models. Here’s a succinct summary of the key concepts, along with additional discussions.
Nov 15, 2023
Comparing Sequence-to-Sequence Decoders: With and Without Attention

: Machine Learning, Deep Learning, NLP

This post goes beyond a conventional code walkthrough inspired by this tutorial. My goal is to elevate the narrative by offering a comprehensive comparison between Seq2Seq (sequence-to-sequence) models with and without attention.
Jun 9, 2022
Note: Derivation of Normal Bayesian Test

: Statistics

The Normal Bayesian test is a statistical method used in hypothesis testing, particularly in the context of Bayesian statistics. It is applied to assess the validity of a null hypothesis ($H_0$) versus an alternative hypothesis ($H_1$) by considering the posterior distribution of a parameter of interest. This method is exemplified in Statistical Inference (2nd edition) by Casella and Berger, as shown on page 379.
Jan 20, 2021
Capturing Dominant Spatial Patterns with Two-Dimensional Locations Using SpatPCA

: Software, R, Statistics, Spatial Statistics

In this demonstration, we showcase how to utilize SpatPCA for analyzing two-dimensional data to capture the most dominant spatial pattern.
Jan 18, 2021
Apply SpatPCA to Capture the Dominant Spatial Pattern with One-Dimensional Locations

: Software, R, Statistics, Spatial Statistics

In this tutorial, we explore the application of SpatPCA to capture the most dominant spatial patterns in one-dimensional data, highlighting its performance under varying signal-to-noise ratios.
Sep 13, 2017
De-identification of Personal Data

: De-identification, Statistics

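In this era of open data advocacy, how institutions properly protect personal privacy throughout the data lifecycle, from collection and maintenance to public sharing, is a widely discussed issue. For example, users hope that government data can become more transparent and be enriched with new information to improve its usability, which in turn can support more effective government decision-making. To ensure privacy and avoid exposing individuals' identities, Figure 1 shows the flow from collecting personal information to finally sharing the data. Within this flow, appropriate protective measures must be applied before the data is released in order to protect privacy: de-identification. De-identification is a tool that lets an institution remove personal information from its own database so that the data can be reused or shared with other institutions for academic or commercial research.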

Figure 1. Data de-identification flowchart (source: Garfinkel [5]).
Jun 26, 2017
Challenges in EOF Patterns with a Single Variable

: Statistics, Spatial Statistics

In this post, we delve into potential challenges associated with Empirical Orthogonal Function (EOF) patterns.
Jun 17, 2017
Exploring Dominant Spatial Patterns with a Single Variable

: Statistics, Spatial Statistics

In this post, we delve into the formal exploration of dominant spatial patterns, particularly in the context of climate research.
Jun 3, 2017
How to Work on Sea Surface Temperature (SST) Data

: R, Spatial Statistics, Statistics

In this post, I will show you step-by-step instructions to work on SST data in R.
May 19, 2017
Three Fundamental Aspects of Statistical Models

: Statistics

Recently, I delved into a classic Annals article, “Additive Regression and Other Nonparametric Models” by Stone (1985). The insights gleaned from this piece revolve around three fundamental aspects of statistical models: