2
\$\begingroup\$

This function is a working function, which builds a number of queries depending on the values of $_POST.

group, class, category, subcategory, brand, product are all arrays sent via post from a userform which uses the Chosen jQuery plugin for multi-select boxes. So here I assume that the user cannot maliciously enter their own input (I may be wrong, please correct me if so). A user could easily change the elements of my select fields, and so my IN() clause is vulnerable. How can I dynamically bind parameters, or at least sanitize these inputs before using them in the query.

The only factor in which I am interested is SQL Injection. I am aware that the use of (so many) globals is not recommended for scoping/encapsulation reasons, but this is from a single procedural page which will not have includes anywhere else on the 'site'.

I think I have whitelisted possible values by the use of switch statements, however given this is the first time I have created a dynamic query using direct vars as opposed to only parameters, I am unsure if I have achieved what I want. Which is a dynamic SQL query, that has dynamic columns, which is as safe as can be from SQL injection.

EDIT From what I have observed, any column names are fine as the names themselves are hardcoded and not supplied by the user, but determined by the combination of counts in their selections.

I am most concerned about where I take the arrays and implode them to an IN() clause. But given that the amount of values in each array is unknown I can't think of a reasonable way to bind new parameters the proper way.

    //DETERMINE THE LEVEL TO VIEW DATA AT
    //Return boolean for each combination of filter selection
    //filters cannot be random (i.e. by group, and then by brand)
    //it has to be drilled down i.e. group->class->category->subcategory->brand
    //example
    //if there is 1 'group' selected, show all classes within that group at class level
    //if there are 0, 2 or more 'groups' selected, show those at 'group' level
    //if there is 1 'class' selected, show all categories within that class at category level etc.

    $grouplevel = (count($group) == 0 || count($group) > 1);

    $classlevel = (count($group) == 1) && (    (count($class) == 0) || (count($class) > 1)    );

    $catlevel = (count($class) == 1) && (    (count($category) == 0) || (count($category) > 1)    );

    $sublevel = (count($category) == 1) && (    (count($subcategory) == 0) || (count($subcategory) > 1)    );

    $brandlevel = (count($subcategory) == 1) && (   (count($brand) == 0) || (count($brand) > 1)   );

    $productlevel = (count($brand) == 1);

    //define $level for query builder
    //this is the name of the column used in 
    //the GROUP BY clause for retrieving CHART DATA
    switch(true){

        case $grouplevel:
            $level = "`group`"; //backticks because group is a reserved word in MySQL
        break;
        case $classlevel:
            $level = "class";
        break;
        case $catlevel:
            $level = "category";
        break;
        case $sublevel:
            $level = "subcategory";
        break;
        case $brandlevel:
            $level = "brand";
        break;
        case $productlevel:
            $level = "product";
        break;
    }

    //define empty vars for use in function
    $salesgrowth = "";
    $averageprice = "";
    $changeinrange = "";
    $salestrend = "";
    //separate level var for change in range count chart. 
    //If charts are viewed at product level, make sure that change in range count still only goes to brand level
    $crlevel = ""; 
    $caption = "";
    //query builder function

    function buildQueries($level, $period, $measure){

        //empty vars for queries
        global $salesgrowth;
        global $averageprice;
        global $changeinrange;
        global $salestrend;

        //POST values
        global $yr;
        global $yrmth;
        global $ly;
        global $destination;
        global $group;
        global $class;
        global $category;
        global $subcategory;
        global $brand;
        global $product;
        global $crlevel;

        global $caption;

        $currentyr = date("Y");
        $currentmth = date("m");

        //remove backticks for use as array index
        if($level == "`group`"){ 
            $lev = "group"; 
        } else { 
            $lev = $level;
        }


        switch($lev){

            case "group":

                $intersect = " `group` IN('";
                $intersect.= implode("','", $group);
                $intersect.="')";
            break;

            case "class":

                if(count($class) == 0){

                    $intersect = " `group` = '".$group[0]."'";

                } else {

                    $intersect = " `group` = '".$group[0]."' AND class IN('";
                    $intersect.= implode("','", $class);
                    $intersect.="')";
                }
            break;

            case "category" :

                if(count($category) == 0){

                    $intersect = " `group` = '".$group[0]."' AND class = '".$class[0]."'";

                } else {

                    $intersect = " `group` = '".$group[0]."' AND class = '".$class[0]."' AND category IN('";
                    $intersect.= implode("','", $category);
                    $intersect.="')";
                }
            break;

            case "subcategory":

                if(count($subcategory) == 0){

                    $intersect = " `group` = '".$group[0]."' AND  class = '".$class[0]."' AND category = '".$category[0]."'";

                } else {

                    $intersect = " `group` = '".$group[0]."' AND class = '".$class[0]."' AND category = '".$category[0]."' AND subcategory IN('";
                    $intersect.= implode("','", $subcategory);
                    $intersect.="')";
                }
            break;

            case "brand":

                if(count($brand) == 0){

                    $intersect = " `group` = '".$group[0]."' AND  class = '".$class[0]."' AND category = '".$category[0]."' AND subcategory = '".$subcategory[0]."'";

                } else {

                    $intersect = " `group` = '".$group[0]."' AND class = '".$class[0]."' AND category = '".$category[0]."' AND subcategory = '".$subcategory[0]."' AND brand IN('";
                    $intersect.= implode("','", $brand);
                    $intersect.="')";
                }
            break;

            case "product":

                if(count($product) == 0){

                    $intersect = " `group` = '".$group[0]."' AND  class = '".$class[0]."' AND category = '".$category[0]."' AND subcategory = '".$subcategory[0]."' AND brand = '".$brand[0]."'";

                } else {

                    $intersect = " `group` = '".$group[0]."' AND class = '".$class[0]."' AND category = '".$category[0]."' AND subcategory = '".$subcategory[0]."' AND brand = '".$brand[0]."' AND sku IN('";
                    $intersect.= implode("','", $product);
                    $intersect.="')";
                }
            break;
        }


        switch($period){

            case "year":

                $col = "yr"; // set the crosstab column in the growth query
                $yrmth = "%"; //yrmth as wildcard because were only filtering by year
                $filt = "`yr` LIKE :yr AND `yrmth` LIKE :yrmth "; // section to put into price query WHERE clause
                $filt1 = "`yr` LIKE :yr1 AND `yrmth` LIKE :yrmth1 "; // section to put into price query WHERE clause
                $lyr = $yr - 1; //set value for last year
                $lyfilt = "= $lyr"; // growth query CASE for period 1
                $cyfilt = "= $yr";  // into growth query CASE for period 2

                $caption ="<div class='col-sm-10 alert alert-info' role='alert' align='left'>";
                $caption.="Analysing <strong>$yr</strong> ".ucwords($measure)." at <strong>$lev</strong> level.</div>";
            break;

            case "month":

                $col = "yrmth"; //set the crosstab column in the growth query
                $filt = "`yr` LIKE :yr AND `yrmth` LIKE :yrmth "; // average price query WHERE clause
                $filt1 = "`yr` LIKE :yr1 AND `yrmth` LIKE :yrmth1 "; // average price query WHERE clause
                $lyrmth = ($yr - 1).substr($yrmth, 4, 2); // set the current month for last year so growth can be compared
                $lyfilt = "= $lyrmth"; // growth query CASE for period 1
                $cyfilt = "= $yrmth"; // growth query CASE for period 2

                $caption ="<div class='col-sm-10 alert alert-info' role='alert' align='left'>";
                $caption.="Analysing <strong>$yrmth</strong> ".ucwords($measure)." at <strong>$lev</strong> level.</div>";
            break;

            case "ytd":

                $col = "yrmth"; //set the crosstab column in the growth query
                //set beginning of year value
                $yr = $currentyr."01";  //use the $yr variable for <period start> so query parameters don't need to change
                $yrmth = $currentyr.$currentmth; //set current yrmth
                $lyr = ($currentyr - 1)."01"; //set beginning of last year
                $lyrmth = ($currentyr - 1).$currentmth; // set this month last year
                //average price query WHERE clause beginning - notice that :yr($yr) is used at <period start>
                $filt = "`yrmth` BETWEEN :yr AND :yrmth "; 
                $filt1 = "`yrmth` BETWEEN :yr1 AND :yrmth1 "; 
                $lyfilt = "BETWEEN $lyr AND $lyrmth"; // growth query case for period 1
                $cyfilt = "BETWEEN $yr AND $yrmth"; // growth query case for period 2
                $caption ="<div class='col-sm-10 alert alert-info' role='alert' align='left'>";
                $caption.="Analysing <strong>YTD</strong> ".ucwords($measure)." at <strong>$lev</strong> level.</div>";
            break;

            case "ttm":

                $col = "yrmth"; //set the crosstab column in the growth query
                $yr = ($currentyr - 1)."0".($currentmth + 1); //set beginning of TTM period
                $yrmth = $currentyr.$currentmth;            //set end of ttm period (current yrmth)
                $lyr = ($currentyr - 2)."0".($currentmth + 1); // set beginning of previous TTM period
                $lyrmth = ($currentyr - 1).$currentmth;     // set end of previous TTM period 
                $filt = "`yrmth` BETWEEN :yr AND :yrmth "; // average prive query WHERE clause
                $filt1 = "`yrmth` BETWEEN :yr1 AND :yrmth1 "; // average prive query WHERE clause
                $lyfilt = "BETWEEN $lyr AND $lyrmth";   //growth query CASE for period 1
                $cyfilt = "BETWEEN $yr AND $yrmth";     //grrowth query CASE for period 2
                $caption ="<div class='col-sm-10 alert alert-info' role='alert' align='left'>";
                $caption.="Analysing ".ucwords($measure)." between <strong>$yr</strong> and <strong>$yrmth</strong> at <strong>$lev</strong> level.</div>";
            break;
        }

        //ensure there is no AND in the query when no filters are selected
        if(empty($intersect)){ $and = ""; } else { $and = "AND"; }

        // define average price query - note that $level (GROUP BY column) is taken from function arguments
        $averageprice = "SELECT $level,
                                SUM(a.`sales`) as sales, SUM(a.`units`) as units, ROUND(a.sales / a.units, 2) as asp,
                                SUM(a.`sales`) / b.total * 100 as share
                            FROM `brandlevel_data` as a
                            CROSS JOIN (SELECT SUM(sales) as total FROM `brandlevel_data` WHERE $filt1 $and $intersect AND destination LIKE :destination) as b
                            WHERE $filt $and
                                $intersect AND 
                                destination LIKE :destination1
                            GROUP BY $level";

        // define sales growth query - note that $col is defined in the switch above and is the column to
        // aggregate in the crosstab, $lyfilt and $cyfilt are also defined in the switch above
        // and define the CASE logic for each aggregate. $level used from function arguments to group by
        $salesgrowth = "SELECT $level,
                               SUM(CASE WHEN $col $lyfilt THEN $measure ELSE 0 END) AS `period1`,
                               SUM(CASE WHEN $col $cyfilt THEN $measure ELSE 0 END) AS `period2`,

                                SUM(CASE WHEN $col $cyfilt THEN $measure ELSE 0 END) - 
                                SUM(CASE WHEN $col $lyfilt THEN $measure ELSE 0 END) AS `growth`,

                                COALESCE((((SUM(CASE WHEN yr = 2014 THEN sales ELSE 0 END) - SUM(CASE WHEN yr = 2013 THEN sales ELSE 0 END)) / 
                                (SUM(CASE WHEN yr = 2013 THEN sales ELSE 0 END))) * 100), 0) + 100 AS `index` 
                        FROM brandlevel_data as a
                        WHERE $intersect AND 
                            destination LIKE :destination 
                        GROUP BY $level";   

        //define change in range count query
        //to avoid change in range count displaying 0 when at product level
        //reverse this chart only back to brand level and show change in range count for brand
        //by changing the query intersect values
        if($level == "product"){ 

            $crlevel = "brand";
            $crintersect = " subcategory = '".$subcategory[0]."' AND brand = '".$brand[0]."'";

        } else { 

            $crlevel = $level; 
            $crintersect = $intersect;
        }


        $changeinrange = "SELECT $crlevel,
                                COUNT(DISTINCT(CASE WHEN $col $lyfilt THEN sku ELSE NULL END)) as period1,
                                COUNT(DISTINCT(CASE WHEN $col $cyfilt THEN sku ELSE NULL END)) as period2,
                                COUNT(DISTINCT(CASE WHEN $col $cyfilt THEN sku ELSE NULL END)) -
                                COUNT(DISTINCT(CASE WHEN $col $lyfilt THEN sku ELSE NULL END)) as var
                            FROM brandlevel_data
                            WHERE $crintersect AND 
                                destination LIKE :destination 
                            GROUP BY $crlevel";


        $salestrend = "SELECT RIGHT(yrmth, 2) as period, 
                                SUM(CASE WHEN yr = :ly THEN sales ELSE 0 END) as period1,
                                SUM(CASE WHEN yr = :cy THEN sales ELSE 0 END) as period2
                        FROM brandlevel_data
                        WHERE $intersect AND 
                            destination LIKE :destination 
                        GROUP BY period";
    }

    buildQueries($level, $period, $measure); //execute buildqueries function


    //prepare data queries & bind parameters
    $averagepricedata = $db->prepare($averageprice);
    $averagepricedata->bindParam(":yr", $yr); // when comparing period other than 'Year' this param is used for <period start>
    $averagepricedata->bindParam(":yrmth", $yrmth); // when comparing period other than 'Yrmth' this param is ussed for <period end>
    $averagepricedata->bindParam(":yr1", $yr); // when comparing period other than 'Year' this param is used for <period start>
    $averagepricedata->bindParam(":yrmth1", $yrmth); // when comparing period other than 'Yrmth' this param is ussed for <period end>
    $averagepricedata->bindParam(":destination", $destination); //destination bound twice, because it's used twice in the query
    $averagepricedata->bindParam(":destination1", $destination);

    //echo "<br>".$averageprice."<br><br><br>";  //echo query for debugging
    //echo "<br>".$salesgrowth."<br><br>";  // echo query for debugging
    //echo "<br>".$changeinrange."<br><br><br>"; //echo query for debugging
    //echo "<br>".$salestrend."<br><br><br>"; // echo query for debugging

    $salesgrowthdata = $db->prepare($salesgrowth);
    $salesgrowthdata->bindParam(":destination", $destination);

    $changeinrangedata = $db->prepare($changeinrange);
    $changeinrangedata->bindParam(":destination", $destination);

    $salestrenddata = $db->prepare($salestrend);
    $salestrenddata->bindParam(":cy", $selectedyear);
    $salestrenddata->bindParam(":ly", $selectedlastyear);
    $salestrenddata->bindParam(":destination", $destination);
\$\endgroup\$
3
  • 1
    \$\begingroup\$ "So here I assume that the user cannot maliciously enter their own input". You are wrong, a user can post whatever they like at your website - they don't have to use your UI at all. (+1 for the Question though) \$\endgroup\$
    – RobH
    Commented Mar 2, 2016 at 14:56
  • \$\begingroup\$ I thought I may be wrong there. Do you have any links to sources which can give me examples of how those malicious users would achieve this? Definitely somethign I need to be more aware of \$\endgroup\$ Commented Mar 2, 2016 at 15:10
  • 1
    \$\begingroup\$ As it's html, someone could open their browser's dev tools, change one of the options to something like '; drop table brandlevel_data; -- and then submit your form. Either that, or they could compose their own request in any number of ways or point an automated tool at your website. \$\endgroup\$
    – RobH
    Commented Mar 2, 2016 at 15:14

2 Answers 2

4
\$\begingroup\$

Security and Readability

So here I assume that the user cannot maliciously enter their own input (I may be wrong, please correct me if so).

Yes, you are wrong about that. POST and GET are 100% user controlled and not limited by your frontend (be it JavaScript, HTML, or something else).

The only factor in which I am interested is SQL Injection. I am aware that the use of (so many) globals is not recommended for scoping/encapsulation reasons, but this is from a single procedural page which will not have includes anywhere else on the 'site'.

Still, even for a single page, using global is not a good idea. In combination with your deeply nested code, it makes it very hard to see what is actually going on. And that has a direct impact on security.

Just seeing $averagepricedata = $db->prepare($averageprice); looks nice. It's using prepared statements and everything. But what is averageprice really? It's this:

    $averageprice = "SELECT $level,
                            SUM(a.`sales`) as sales, SUM(a.`units`) as units, ROUND(a.sales / a.units, 2) as asp,
                            SUM(a.`sales`) / b.total * 100 as share
                        FROM `brandlevel_data` as a
                        CROSS JOIN (SELECT SUM(sales) as total FROM `brandlevel_data` WHERE $filt1 $and $intersect AND destination LIKE :destination) as b
                        WHERE $filt $and
                            $intersect AND 
                            destination LIKE :destination1
                        GROUP BY $level";

That looks way less nice. There are definitely variables directly inside an SQL query. So is this secure? To know that, we have to dive deep into the code, as we have no idea what the variables actually stand for.

If we do that, we can see that:

  • $level is an argument to the function. As the function is currently used, it can only contain hardcoded values, so right now, it's safe. But functions should be reusable, so we are not actually sure if it will always be safe.
  • $and is just AND or an empty string.
  • $filt and $filt1 are hardcoded.
  • $intersect is quite complex, but it contains at least parts of $group, which is a global, and as you said user controlled.

Thus we found our SQL injection.

Note that all the other parameters you specified as user input are also vulnerable. It is likely that other parameters are vulnerable as well.

For example $measure in SUM(CASE WHEN $col $lyfilt THEN $measure ELSE 0 END) AS `period1`, (although it's not clear where that comes from).

Another example would be $yr, which is also a global, and is inserted into a query here: $cyfilt = "BETWEEN $yr AND $yrmth";.

Note also that depending on what the global $yr is, and what you do with $caption, you are likely open to XSS.

From a security standpoint, I would rewrite the code from scratch. It will be extremely difficult to fix it in its current form, because it's so confusing.

Ideally, you could extract parts of the query generation to their own functions to increase readability.

Here, it is also extremely important to know where what variables actually come from (which is one reason against global). In each function you create, you want to be extra careful with the arguments that are passed to it.

So you might eg build a createIntersectSubQuery($group) function. Now, inside that function, you know that $group is dangerous and should not be put into the query, but only used to calculate the amount of ? you need.

given that the amount of values in each array is unknown I can't think of a reasonable way to bind new parameters the proper way.

See eg here.

Misc

  • don't shorten variable names. crlevel, ly, yr, lyr, yrmth (year month?), lev, filt, filt1, cyfilt, lyfilt, col are all unclear.
  • use camelCase or snake_case to make variables more readable.
\$\endgroup\$
2
\$\begingroup\$

Following up on @tim 's response I would also recommend sanitizing and normalizing the users input before interacting with it. Safe data is better data.

Ref:

http://php.net/manual/en/function.array-filter.php

http://php.net/manual/en/function.filter-var.php

http://php.net/manual/en/function.filter-id.php

\$\endgroup\$
2
  • \$\begingroup\$ Excellent links. So using a filter such as full_special_chars, would I then benefit from the same sanitisation as binding a parameter for example? \$\endgroup\$ Commented Mar 3, 2016 at 17:51
  • \$\begingroup\$ IMO, I would do both. Sanitize input AND parameterize the query. To be on the safe side, look around for input sensitization libraries. \$\endgroup\$ Commented Mar 3, 2016 at 20:27

Start asking to get answers

Find the answer to your question by asking.

Ask question

Explore related questions

See similar questions with these tags.