Feature Engineering

The performance of data-driven models, including machine learning models, depends strongly on the quality and relevance of the features used for training. To enhance the predictive capability of RECLAIM, we derived new features from the raw variables described in Section 2, a process commonly referred to as feature engineering. These derived features are designed to capture processes relevant to reservoir sedimentation that the raw variables do not represent directly. The table below summarizes all derived features. For each feature, the Reference Table column names the table (static or dynamic variables) that defines the variables used in its formula.

| No | Name | Abbreviation | Definition/Formula | Reference Table | Units |
|---|---|---|---|---|---|
| P14 | Land cover of artificial surfaces | LCAS | LCAS%*CA | static variables | km² |
| P15 | Land cover of cropland | LCC | LCC%*CA | static variables | km² |
| P16 | Land cover of grassland | LCG | LCG%*CA | static variables | km² |
| P17 | Land cover of trees | LCT | LCT%*CA | static variables | km² |
| P18 | Land cover of shrubs | LCS | LCS%*CA | static variables | km² |
| P19 | Land cover of herbaceous vegetation | LCHV | LCHV%*CA | static variables | km² |
| P20 | Land cover of mangroves | LCM | LCM%*CA | static variables | km² |
| P21 | Land cover of sparse vegetation | LCSV | LCSV%*CA | static variables | km² |
| P22 | Land cover of bare soil | LCBS | LCBS%*CA | static variables | km² |
| P23 | Land cover of snow and glaciers | LCSG | LCSG%*CA | static variables | km² |
| P24 | Land cover of water bodies | LCWB | LCWB%*CA | static variables | km² |
| P73 | Age at observation end | AGE | OEY-BY | static variables | X |
| P74 | Relative original capacity | ROBC | OBC/CA | static variables | m |
| P75 | Geometry complexity | GC | RA/RP² | static variables | DL |
| P76 | Net vegetation gain frequency | NVGF | VGF-VLF | dynamic variables | X |
| P77 | Ratio tree cover to bare soil | R_tree_bare | LCT/LCBS | static variables | DL |
| P78 | Ratio shrubs to bare soil | R_shrub_bare | LCS/LCBS | static variables | DL |
| P79 | Ratio coarse to sand | R_coarse_sand | COAR/SAND | static variables | DL |
| P80 | Relative mean annual surface area | rel_SA_mean_clip | SA_mean_clip/RA | static & dynamic variables | DL |
| P81 | Ratio surface area to capacity | R_SA_cap | SA_mean_clip/OBC | static & dynamic variables | m⁻¹ |
| P82 | Rainfall per unit area | rain_per_area | MAR/CA | static & dynamic variables | mm/km² |
| P83 | Trapping efficiency | TE | 100*e^(-0.0079*(MAI*3600*24*365/(OBC*1e6))) | static variables | % |
| P84 | Residence time | RT | OBC*1e6/(MAI*3600*24*365) | static variables | years |
| P85 | Estimated capacity loss rate | ECLR | TE*NSSC2_mean/RT | dynamic variables | %/year |
| P86 | Estimated sedimentation rate | ESR | ECLR*OBC/100 | dynamic variables | million m³/year |
| P87 | Sediment influx | SIN | MAI*NSSC2_mean | dynamic variables | m³/s |
| P88 | Sediment outflux | SOUT | MAO*NSSC2_mean | dynamic variables | m³/s |
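
To illustrate how the hydrological features P83 to P88 follow from the raw variables, the sketch below evaluates their formulas from the table in Python. It is a minimal sketch, not the RECLAIM implementation: the function names and the input values are hypothetical, and the input units (OBC in million m³, MAI and MAO in m³/s) are assumed from the 1e6 and seconds-per-year factors that appear in the formulas.

```python
import math

# Seconds in a 365-day year, the same constant used in the table formulas.
SECONDS_PER_YEAR = 3600 * 24 * 365


def residence_time(obc, mai):
    """RT (P84) in years. Assumes obc in million m3 and mai in m3/s."""
    return obc * 1e6 / (mai * SECONDS_PER_YEAR)


def trapping_efficiency(obc, mai):
    """TE (P83) in percent, using the exponential form given in the table."""
    return 100.0 * math.exp(-0.0079 * (mai * SECONDS_PER_YEAR / (obc * 1e6)))


def derived_sediment_features(obc, mai, mao, nssc2_mean):
    """Return features P83 to P88 as a dict keyed by abbreviation."""
    rt = residence_time(obc, mai)
    te = trapping_efficiency(obc, mai)
    eclr = te * nssc2_mean / rt           # P85, estimated capacity loss rate
    return {
        "TE": te,
        "RT": rt,
        "ECLR": eclr,
        "ESR": eclr * obc / 100.0,        # P86, estimated sedimentation rate
        "SIN": mai * nssc2_mean,          # P87, sediment influx
        "SOUT": mao * nssc2_mean,         # P88, sediment outflux
    }


# Hypothetical reservoir, for illustration only (not taken from the dataset).
features = derived_sediment_features(obc=100.0, mai=10.0, mao=8.0, nssc2_mean=0.5)
for name, value in features.items():
    print(f"{name}: {value:.4f}")
```

Because the exponent in TE is the reciprocal of RT, the two features carry the same information in different functional forms. Both are undefined when MAI or OBC is zero, so a real pipeline would need to handle such records before applying the formulas.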