What Is Feature Engineering

Question : feature engineering

Answered by : syed-nayeem-ridwan

- Create new features (eg: averages of existing columns, BMI from height and weight etc)
- Visualize the distributions in the dataset with boxplots and pairplots to see whether a transformation is necessary (eg: log transformation)
- Normalize/Standardize/Scale features
- Encoding : convert categories into numeric data
  - One-hot encoding : explainable features, creates N columns for N categories
  - Dummy encoding : the necessary information without duplication, creates N-1 columns for N categories
- Merge low-frequency categorical values (uncommon categories) into a single category (eg: `other`)
- Binarise numeric values (eg: from `num_violations` to `violation_boolean`)
- Deal with missing values:
  - drop missing values when they exceed a threshold (eg: >30% of the dataset)
  - fill values that are missing completely at random (with the mean, median, mode, `Other`, or the next present value in sorted order)
- Deal with outliers
- Validate numeric columns
  - remove non-numeric characters from numeric data (eg: `$` or `,` sign for currency)
  - make sure the column has the proper datatype (eg: `float`, `int` etc)
- For text processing : generate numeric features
  1. Remove unwanted/non-letter characters
  2. Standardize text : convert to lowercase / uppercase
  3. Generate feature, mean word length : average length of words in text = character_count / word_count
  4. Generate feature, bag of words : word count vector = number of times each word appears in a text
  5. Generate feature, normalized significance of words : calculate TF-IDF = normalization of the word count vector (significance of a word in a document compared to all words in all documents)
  6. Generate feature, contextual n-gram significance of word sequences : calculate TF-IDF over n-grams (sequences of n consecutive words) instead of single words
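
The tabular steps in the list above can be sketched with nothing but the Python standard library. This is only an illustration of the mechanics, not a reference implementation: the toy columns (`raw_fines`, `colours`), the helper names, and the `min_count=2` threshold are all made up for the example, and in practice a dataframe library would usually do this work.

```python
import math
import statistics
from collections import Counter

def clean_currency(raw):
    """Strip `$` and `,` and cast to float; an empty string becomes None (missing)."""
    text = raw.replace("$", "").replace(",", "").strip()
    return float(text) if text else None

def fill_missing(values, fill):
    """Replace None with a fixed fill value (eg: the median of the present values)."""
    return [fill if v is None else v for v in values]

def merge_rare(categories, min_count=2, other="other"):
    """Collapse categories seen fewer than `min_count` times into one label."""
    counts = Counter(categories)
    return [c if counts[c] >= min_count else other for c in categories]

def encode(categories, drop_first=False):
    """One-hot encode (N columns); with drop_first=True, dummy encode (N-1 columns)."""
    levels = sorted(set(categories))
    if drop_first:
        levels = levels[1:]  # the dropped level is implied by an all-zero row
    return levels, [[int(c == level) for level in levels] for c in categories]

def standardize(values):
    """Rescale to zero mean and unit (population) standard deviation."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    return [(v - mean) / std for v in values]

# Hypothetical toy columns, invented for this sketch.
raw_fines = ["$1,200.50", "300", "", "$45"]
num_violations = [0, 3, 1, 0]
colours = ["red", "blue", "red", "teal", "blue", "plum"]

# Validate the numeric column, then fill the gap with the median.
fines = [clean_currency(s) for s in raw_fines]
median_fine = statistics.median(v for v in fines if v is not None)
fines = fill_missing(fines, median_fine)

log_fines = [math.log1p(v) for v in fines]                # log transformation
violation_boolean = [int(n > 0) for n in num_violations]  # binarise
merged = merge_rare(colours)                              # rare values -> `other`
one_hot_cols, one_hot = encode(merged)                    # N columns
dummy_cols, dummy = encode(merged, drop_first=True)       # N-1 columns
scaled = standardize(fines)                               # zero mean, unit variance
```

Dummy encoding drops one level because a row of all zeros already identifies it, which is why it carries the same information in one column fewer.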

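The text-processing steps can be sketched the same way. The sample sentences and function names below are invented for illustration, and the TF-IDF here is one plain textbook variant (term frequency times `ln(N / document_frequency)`), which is an assumption: libraries commonly add smoothing and normalization on top. Mean word length counts only the characters left inside words after cleaning.

```python
import math
import re
from collections import Counter

def tokenize(text):
    """Lowercase, replace non-letter characters with spaces, split on whitespace."""
    return re.sub(r"[^a-z\s]", " ", text.lower()).split()

def mean_word_length(tokens):
    """character_count / word_count, counting only characters inside words."""
    return sum(len(t) for t in tokens) / len(tokens)

def ngrams(tokens, n):
    """Join each run of n consecutive tokens into a single term."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def tf_idf(docs):
    """Per-document {term: tf * idf}, with tf = count / len(doc), idf = ln(N / df)."""
    n_docs = len(docs)
    # Document frequency: in how many documents each term appears at least once.
    doc_freq = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        counts = Counter(doc)
        weights.append({
            term: (count / len(doc)) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return weights

# Hypothetical sample texts, invented for this sketch.
texts = ["The cat sat!", "The dog sat, the dog ran."]

docs = [tokenize(t) for t in texts]                         # steps 1 and 2
avg_len = [mean_word_length(doc) for doc in docs]           # step 3
bag_of_words = [Counter(doc) for doc in docs]               # step 4
unigram_weights = tf_idf(docs)                              # step 5
bigram_weights = tf_idf([ngrams(doc, 2) for doc in docs])   # step 6, with n = 2
```

A word that occurs in every document, such as `the` here, gets a weight of zero under this variant, while a term confined to one document keeps a positive weight.
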
Last Update : Sat, 20 Jan 24
