Figure: Areas with the same attribute value (first image) are merged into one (second image)
Attributes of merged areas can be aggregated using various aggregation methods such as sum and mean. The specific methods available depend on the backend used for aggregation. Two aggregate backends (specified in aggregate_backend) are available, univar and sql. When univar is used, the methods available are the ones which v.db.univar uses by default, i.e., n, min, max, range, mean, mean_abs, variance, stddev, coef_var, and sum. When the sql backend is used, the methods in turn depend on the SQL database backend used for the attribute table of the input vector. For SQLite, these are at least the following built-in aggregate functions: count, min, max, avg, sum, and total. For PostgreSQL, the list of aggregate functions is much longer and includes, e.g., count, min, max, avg, sum, stddev, and variance. The sql aggregate backend, regardless of the underlying database, will typically perform significantly better than the univar backend.
Aggregate methods are specified by name in aggregate_methods or using SQL syntax in aggregate_columns. If result_columns is provided including type information and the sql backend is used, aggregate_columns can contain SQL syntax specifying both columns and the functions applied, e.g., aggregate_columns="sum(cows) / sum(animals)". In this case, aggregate_methods should be omitted. This provides the highest flexibility and is suitable for scripting.
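What an expression such as sum(cows) / sum(animals) evaluates to for each group can be tried out directly in SQLite. The following is a minimal sketch using Python's sqlite3 module, not v.dissolve itself; the table farms, its columns, and all values are hypothetical. SQLite truncates the division of two integers, so the sketch multiplies by 1.0 to obtain a fraction.

```python
import sqlite3

# Hypothetical attribute table: two areas in region "A", one in region "B".
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE farms (region TEXT, cows INTEGER, animals INTEGER)")
conn.executemany(
    "INSERT INTO farms VALUES (?, ?, ?)",
    [("A", 10, 40), ("A", 30, 60), ("B", 5, 20)],
)

# One row per group, analogous to one row per dissolved area.
# Multiplying by 1.0 avoids SQLite's integer division.
rows = conn.execute(
    "SELECT region, 1.0 * sum(cows) / sum(animals) AS cow_share "
    "FROM farms GROUP BY region ORDER BY region"
).fetchall()
conn.close()

for region, cow_share in rows:
    print(region, cow_share)
```

With v.dissolve, such an expression would be paired with a typed result column, for example result_columns="cow_share double precision" (the name cow_share is made up for this illustration).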
The backend is, by default, determined automatically based on the requested methods. Specifically, the sql backend is used by default, but when a method is not one of the SQLite built-in aggregate functions and, at the same time, is available with the univar backend, the univar backend is used. The default behavior is intended for interactive use and testing. For scripting and other automated usage, specifying the backend explicitly is strongly recommended.
For convenience, certain methods, namely n, count, mean, and avg, are converted to the name appropriate for the selected backend. However, for scripting, specifying the appropriate method (function) name for the backend is recommended because the conversion is a heuristic which may change in the future.
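The selection and conversion rules described above can be sketched in a few lines of Python. This is an illustration only and not the actual v.dissolve code: the function names choose_backend and convert_names are hypothetical, and treating n and mean as convertible before the backend check is an assumption. The method lists are the ones given in this manual.

```python
# Method lists taken from the text above; the functions are hypothetical.
SQLITE_AGGREGATES = {"count", "min", "max", "avg", "sum", "total"}
UNIVAR_METHODS = {
    "n", "min", "max", "range", "mean", "mean_abs",
    "variance", "stddev", "coef_var", "sum",
}
# Names that have a direct equivalent in the other backend.
TO_SQL = {"n": "count", "mean": "avg"}
TO_UNIVAR = {"count": "n", "avg": "mean"}


def choose_backend(methods):
    """Pick "sql" unless some method is only available with univar."""
    for method in methods:
        # Assumption: convertible names do not force the univar backend.
        sql_name = TO_SQL.get(method, method)
        if sql_name not in SQLITE_AGGREGATES and method in UNIVAR_METHODS:
            return "univar"
    return "sql"


def convert_names(methods, backend):
    """Translate n/count and mean/avg to the name the backend expects."""
    mapping = TO_SQL if backend == "sql" else TO_UNIVAR
    return [mapping.get(method, method) for method in methods]


print(choose_backend(["sum", "mean"]))      # sql
print(choose_backend(["sum", "stddev"]))    # univar
print(convert_names(["n", "mean"], "sql"))  # ['count', 'avg']
```

Because the real conversion is a heuristic, a script is better off passing aggregate_backend and backend-specific method names explicitly, as recommended above.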
If only aggregate_columns is provided, methods default to n, min, max, mean, and sum. If the univar backend is specified, all the available methods for the univar backend are used.
If result_columns is not provided, each method is applied to each specified column, producing result columns for all combinations. These result columns have auto-generated names based on the aggregate column and method. If result_columns is provided, each method is applied only once to the matching column in the aggregate column list and the result will be available under the name of the matching result column. In other words, the numbers of items in aggregate_columns, aggregate_methods (unless omitted), and result_columns need to match and no combinations are created on the fly. For scripting, it is recommended to specify all resulting column names, while for interactive use, automatically created combinations are expected to be beneficial, especially for exploratory analysis.
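The difference between the two modes can be sketched as follows. The function plan_result_columns is hypothetical and only mirrors the rules stated above; the column_method naming pattern is inferred from result names such as ACRES_n and ACRES_sum shown in the examples of this manual.

```python
from itertools import product


def plan_result_columns(aggregate_columns, methods, result_columns=None):
    """Return (result name, column, method) triples."""
    if result_columns is None:
        # No names given: every method is applied to every column.
        return [
            (f"{column}_{method}", column, method)
            for column, method in product(aggregate_columns, methods)
        ]
    # Names given: items are matched by position, so the counts must agree.
    if not (len(aggregate_columns) == len(methods) == len(result_columns)):
        raise ValueError("numbers of columns, methods, and result columns differ")
    return list(zip(result_columns, aggregate_columns, methods))


# All combinations: four result columns.
print(plan_result_columns(["ACRES", "NEW_PERC_G"], ["sum", "avg"]))
# Matched one-to-one: two result columns.
print(
    plan_result_columns(
        ["ACRES", "NEW_PERC_G"], ["sum", "avg"], ["acres", "new_perc_g"]
    )
)
```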
The type of the result column is determined based on the method selected. For n and count, the type is INTEGER and for all other methods, it is DOUBLE. Aggregate methods which produce other types require the type to be specified as part of the result_columns. A type can be provided in result_columns using the SQL syntax name type, e.g., sum_of_values double precision. Type specification is mandatory when SQL syntax is used in aggregate_columns (and aggregate_methods is omitted).
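As a small illustration of these typing rules, the sketch below derives the default type from a method name and splits a name type item into its two parts. Both helper functions are hypothetical and are not part of v.dissolve.

```python
def result_column_type(method):
    """Default type by method: counts are INTEGER, everything else DOUBLE."""
    return "INTEGER" if method in ("n", "count") else "DOUBLE"


def split_name_and_type(spec):
    """Split a 'name type' item; the type is None when only a name is given."""
    name, _, column_type = spec.strip().partition(" ")
    return name, column_type.strip() or None


print(result_column_type("count"))  # INTEGER
print(result_column_type("sum"))    # DOUBLE
print(split_name_and_type("sum_of_values double precision"))
print(split_name_and_type("acres"))
```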
Multiple attributes may be linked to a single vector entity through numbered fields referred to as layers. Refer to v.category for more details.
Merging of areas can also be accomplished using v.extract -d which provides some additional options. In fact, v.dissolve is simply a front-end to that module. When the column parameter is used, v.dissolve additionally calls v.reclass beforehand.
v.dissolve input=undissolved output=dissolved
g.copy vect=soils_general,mysoils_general
v.dissolve mysoils_general output=mysoils_general_families column=GSL_NAME
# patch tiles after import:
v.patch -e `g.list type=vector pat="clc2000_*" separator=","` out=clc2000_patched
# remove duplicated tile boundaries:
v.clean clc2000_patched out=clc2000_clean tool=snap,break,rmdupl thresh=.01
# dissolve based on column attributes:
v.dissolve input=clc2000_clean output=clc2000_final col=CODE_00
v.dissolve input=boundary_municp column=DOTURBAN_N output=municipalities \
    aggregate_columns=ACRES
To inspect the result, select only the row for DOTURBAN_N == 'Wadesboro':
v.db.select municipalities where="DOTURBAN_N == 'Wadesboro'" separator=tab
cat  DOTURBAN_N  ACRES_n  ACRES_min  ACRES_max  ACRES_mean  ACRES_sum
66   Wadesboro   2        634.987    3935.325   2285.156    4570.312
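The numbers in this row can be cross-checked with plain SQLite. The sketch below is not how v.dissolve is implemented; it only assumes that the two dissolved Wadesboro parts have the ACRES values reported above as minimum and maximum, and it applies the SQLite counterparts of the default methods (count for n, avg for mean).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE boundary_municp (DOTURBAN_N TEXT, ACRES REAL)")
# With n = 2, the two ACRES values are the reported min and max.
conn.executemany(
    "INSERT INTO boundary_municp VALUES (?, ?)",
    [("Wadesboro", 634.987), ("Wadesboro", 3935.325)],
)
# SQLite counterparts of the default methods n, min, max, mean, and sum.
row = conn.execute(
    "SELECT count(ACRES), min(ACRES), max(ACRES), avg(ACRES), sum(ACRES) "
    "FROM boundary_municp GROUP BY DOTURBAN_N"
).fetchone()
conn.close()

# Round to three decimals, as in the table above.
print([round(value, 3) for value in row])
```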
v.dissolve input=boundary_municp column=DOTURBAN_N output=municipalities_2 \
    aggregate_columns=ACRES aggregate_methods=sum
v.dissolve input=boundary_municp column=DOTURBAN_N output=municipalities_3 \
    aggregate_columns=ACRES,NEW_PERC_G aggregate_methods=sum,avg
The v.dissolve module will apply each aggregate method only to the corresponding column when column names for the results are specified manually with the result_columns option:
v.dissolve input=boundary_municp column=DOTURBAN_N output=municipalities_4 \
    aggregate_columns=ACRES,NEW_PERC_G aggregate_methods=sum,avg \
    result_columns=acres,new_perc_g
v.dissolve input=boundary_municp column=DOTURBAN_N output=municipalities_5 \
    aggregate_columns=ACRES,DOTURBAN_N,TEXT_NAME aggregate_methods=sum,count,count \
    result_columns=acres,number_of_parts,named_parts
Modifying the previous example, we will now specify the SQL aggregate function calls explicitly instead of letting v.dissolve generate them for us. We will compute the sum of the ACRES column using sum(ACRES) (alternatively, we could use the SQLite-specific total(ACRES), which returns zero even when all values are NULL). Further, we will count the number of aggregated (i.e., dissolved) parts using count(*), which counts all rows regardless of NULL values. Then, we will count all unique names of parts as distinguished by the MB_NAME column using count(distinct MB_NAME). Finally, we will collect all these names into a comma-separated list using group_concat(MB_NAME):
v.dissolve input=boundary_municp column=DOTURBAN_N output=municipalities_6 \
    aggregate_columns="total(ACRES),count(*),count(distinct MB_NAME),group_concat(MB_NAME)" \
    result_columns="acres REAL,named_parts INTEGER,unique_names INTEGER,names TEXT"
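How these SQLite functions treat NULL values can be checked independently of GRASS. The following sketch uses Python's sqlite3 module with a hypothetical table parts and made-up values: group X has one NULL acreage, and group Y consists only of NULL values. The order of items returned by group_concat is not guaranteed by SQLite, so results should not rely on it.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE parts (town TEXT, acres REAL, name TEXT)")
conn.executemany(
    "INSERT INTO parts VALUES (?, ?, ?)",
    [
        ("X", 1.5, "North"),
        ("X", 2.5, "North"),
        ("X", None, "South"),  # NULL acreage
        ("Y", None, None),     # only NULL values in this group
    ],
)
rows = conn.execute(
    "SELECT town, sum(acres), total(acres), count(*), "
    "count(DISTINCT name), group_concat(name) "
    "FROM parts GROUP BY town ORDER BY town"
).fetchall()
conn.close()

for row in rows:
    print(row)
# For group Y, sum() gives NULL (None in Python) while total() gives 0.0,
# count(*) still counts the row, and count(DISTINCT name) is 0.
```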
When working with general SQL syntax, v.dissolve turns off its checks for the number of aggregate and result columns to allow for all SQL syntax to be used for aggregate columns. This also allows us to use functions with multiple parameters, for example, to specify the separator to be used with group_concat:
v.dissolve input=boundary_municp column=DOTURBAN_N output=municipalities_7 \
    aggregate_columns="group_concat(MB_NAME, ';')" \
    result_columns="names TEXT"
To inspect the result, select only the row for DOTURBAN_N == 'Wadesboro':
v.db.select municipalities_7 where="DOTURBAN_N == 'Wadesboro'" separator=tab
cat  DOTURBAN_N  names
66   Wadesboro   Wadesboro;Lilesville
Available at: v.dissolve source code (history)
Accessed: Wednesday Feb 14 00:13:57 2024
© 2003-2023 GRASS Development Team, GRASS GIS 8.4.0dev Reference Manual