Positron emission tomography (PET) scanners incorporating lutetium-based (Lu-based) scintillators are widely used in nuclear medicine. However, their application to imaging very low activity distributions (<100 kBq) is limited by the intrinsic ¹⁷⁶Lu radiation emitted from the scintillators. To visualize very low activities, the ¹⁷⁶Lu background needs to be reduced or removed. This study proposes a supervised learning method that separates background coincidences from true coincidences arising from the source. The optimal classifier was determined by comparing five candidates: logistic regression, support vector machine, random forest, extreme gradient boosting (XGBoost) and deep neural network. Five energy- and time-of-flight (TOF)-related features are extracted from each coincidence event to form the training and test sets. The proposed method was verified on a pair of TOF-PET detector modules. Since measured source coincidences cannot be distinguished from background events experimentally, simulated source coincidences are used to train the classification model. The simulated feature spectra are therefore compared with those obtained from measurement to verify the feasibility of classifying measured coincidences using a model trained on simulation. The XGBoost classifier performed best, achieving a classification accuracy above 99%. It was subsequently tested by imaging a point-like source and planar Derenzo and bar phantoms with the pair of TOF-PET detectors. With XGBoost classification applied, the reconstructed images showed an 89.4% image contrast enhancement for the Derenzo phantom at an activity concentration of 100 Bq mm⁻², and a 52.4% improvement in peak-to-valley ratio across the bar phantom at a concentration of 25 Bq mm⁻².
The proposed method could extend the use of Lu-based PET scanners to the detection and imaging of very low activities, and it has the potential to be applied to a variety of molecular imaging tasks that require detecting low-level signals.
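
The classification step described above can be illustrated with a minimal sketch. It is not the study's implementation: it uses plain logistic regression (one of the five classifiers compared, not the selected XGBoost model), and the five feature values are synthetic Gaussian numbers with hypothetical means, standing in for the energy- and TOF-related features. It only shows the general pattern of training on one labelled set of events and scoring a separate set.

```python
import math
import random


def make_events(n, mean, rng):
    """Draw n synthetic events, each with five Gaussian features (assumed, not real data)."""
    return [[rng.gauss(m, 1.0) for m in mean] for _ in range(n)]


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def train_logistic(X, y, lr=0.1, epochs=200):
    """Fit weights and bias by batch gradient descent on the log-loss."""
    w, b, n = [0.0] * len(X[0]), 0.0, len(X)
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * len(w), 0.0
        for xi, yi in zip(X, y):
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j, xj in enumerate(xi):
                grad_w[j] += err * xj
            grad_b += err
        w = [wj - lr * g / n for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def accuracy(w, b, X, y):
    """Fraction of events whose predicted label (score > 0) matches the true label."""
    hits = 0
    for xi, yi in zip(X, y):
        pred = 1 if sum(wj * xj for wj, xj in zip(w, xi)) + b > 0 else 0
        hits += int(pred == yi)
    return hits / len(y)


# Hypothetical feature means for the two classes (label 1 = source, 0 = background).
SOURCE_MEAN = [1.5, 1.5, 1.5, 1.5, 1.5]
BACKGROUND_MEAN = [-1.5, -1.5, -1.5, -1.5, -1.5]

rng = random.Random(0)
X_train = make_events(200, SOURCE_MEAN, rng) + make_events(200, BACKGROUND_MEAN, rng)
y_train = [1] * 200 + [0] * 200
X_test = make_events(100, SOURCE_MEAN, rng) + make_events(100, BACKGROUND_MEAN, rng)
y_test = [1] * 100 + [0] * 100

w, b = train_logistic(X_train, y_train)
test_acc = accuracy(w, b, X_test, y_test)
print(f"test accuracy on synthetic events: {test_acc:.3f}")
```

In the study, the training events come from simulation and the classified events from measurement, which is why the simulated and measured feature spectra are compared. The sketch draws both sets from the same synthetic distributions and so does not capture that step.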