Updated on 2022-11-18 GMT+08:00

Hive SQL Is Incompatible with SQL2003 Standards

This document describes the incompatibility issues between Hive SQL and SQL2003.

  • The view cannot be written in having.

    For example:

    select c_last_name
          ,c_first_name
          ,s_store_name
          ,sum(netpaid) paid
    from ssales
    where i_color = 'chiffon'
    group by c_last_name
            ,c_first_name
            ,s_store_name
    having sum(netpaid) > (select 0.05*avg(netpaid) from ssales);

    Error message:

    Error: Error while compiling statement: FAILED:ParseException line 46:23 cannot recognize input near 'select' '0.05' '*' in expression specification (state=42000,code=40000)
  • The exists query entity association is not supported.

    For example:

    select
           o_orderpriority,
           count(*) as order_count
    from
           orders
    where
           o_orderdate >= date '1996-09-01'
           and o_orderdate < date '1996-09-01' + interval '3' month
           and exists (
                  select
                         *
                  from
                         lineitem
                  where
                         l_orderkey = o_orderkey
                         and l_commitdate < l_receiptdate
           )
    group by
           o_orderpriority
    order by
           o_orderpriority;

    Error message:

    Error: Error while compiling statement: FAILED: SemanticException [Error 10250]: Line 1:181 Invalid SubQuery expression 'l_receiptdate': For Exists/Not Exists operator SubQuery must be Correlated. (state=42000,code=10250)

  • The Having does not support sub-query.

    For example:

    select
           ps_partkey,
           sum(ps_supplycost * ps_availqty) as value
    from
           partsupp,
           supplier,
           nation
    where
           ps_suppkey = s_suppkey
           and s_nationkey = n_nationkey
           and n_name = 'SAUDI ARABIA'
    group by
           ps_partkey having
                  sum(ps_supplycost * ps_availqty) > (
                         select
                                sum(ps_supplycost * ps_availqty) * 0.0001000000
                         from
                                partsupp,
                                supplier,
                                nation
                         where
                                ps_suppkey = s_suppkey
                                and s_nationkey = n_nationkey
                                and n_name = 'SAUDI ARABIA'
                  )
    order by
           value desc;

    Error message:

    Error: Error while compiling statement: FAILED: SemanticException Line 0:-1 Unsupported SubQuery Expression ''SAUDI ARABIA'': Only SubQuery expressions that are top level conjuncts are allowed (state=42000,code=40000)
  • Multiple query results cannot be displayed as multiple fields.

    For example:

    select
           c_count,
           count(*) as custdist
    from
           (
                  select
                         c_custkey,
                         count(o_orderkey)
                  from
                         customer left outer join orders on
                                c_custkey = o_custkey
                                and o_comment not like '%pending%requests%'
                  group by
                         c_custkey
           ) as c_orders (c_custkey, c_count)
    group by
           c_count
    order by
           custdist desc,
           c_count desc;

    Error message:

    Error: Error while compiling statement: FAILED: ParseException line 1:213 missing EOF at '(' near 'c_orders' (state=42000,code=40000)
  • The query results cannot be compared as fields.

    For example:

    select
           sum(l_extendedprice) / 7.0 as avg_yearly
    from
           lineitem,
           part
    where
           p_partkey = l_partkey
           and p_brand = 'Brand#25'
           and p_container = 'MED JAR'
           and l_quantity < (
                  select
                         0.2 * avg(l_quantity)
                  from
                         lineitem
                  where
                         l_partkey = p_partkey
           );

    Error message:

    Error: Error while compiling statement: FAILED: SemanticException [Error 10249]: Line 14:4 Unsupported SubQuery Expression 'ps_suppkey': Correlating expression contains ambiguous column references. (state=42000,code=10249)
  • The multi-table association query does not support the sub-query filter by not in or in.

    For example:

    select
     p_brand,
     p_type,
     p_size,
     count(distinct ps_suppkey) as supplier_cnt
    from
     partsupp,
     part
    where
     p_partkey = ps_partkey
     and p_brand <> 'Brand#12'
     and p_type not like 'PROMO PLATED%'
     and p_size in (25, 2, 43, 9, 35, 36, 48, 24)
     and ps_suppkey in (
      select
       s_suppkey as ps_suppkey
      from
       supplier
      where
       s_comment like '%Customer%Complaints%'
     )
    group by
     p_brand,
     p_type,
     p_size
    order by
     supplier_cnt desc,
     p_brand,
     p_type,
     p_size;
    

    Error message:

    Error: Error while compiling statement: FAILED: SemanticException [Error 10249]: Line 14:4 Unsupported SubQuery Expression 'ps_suppkey': Correlating expression contains ambiguous column references. (state=42000,code=10249)
  • The multi-table association does not support filtering of not in and in sub-queries.

    For example:

    select
           c_name,
           c_custkey,
           o_orderkey,
           o_orderdate,
           o_totalprice,
           sum(l_quantity)
    from
           customer,
           orders,
           lineitem
    where
           o_orderkey in (
                  select
                         l_orderkey
                  from
                         lineitem
                  group by
                         l_orderkey having
                                sum(l_quantity) > 315
           )
           and c_custkey = o_custkey
           and o_orderkey = l_orderkey
    group by
           c_name,
           c_custkey,
           o_orderkey,
           o_orderdate,
           o_totalprice
    order by
           o_totalprice desc,
           o_orderdate
    limit 100;

    Error message:

    Error: Error while compiling statement: FAILED: SemanticException [Error 10249]: Line 13:0 Unsupported SubQuery Expression 'o_orderkey': Correlating expression contains ambiguous column references. (state=42000,code=10249)
  • The association conditions do not support multiple exists query entities.

    For example:

    select
           s_name,
           count(*) as numwait
    from
           supplier,
           lineitem l1,
           orders,
           nation
    where
           s_suppkey = l1.l_suppkey
           and o_orderkey = l1.l_orderkey
           and o_orderstatus = 'F'
           and l1.l_receiptdate > l1.l_commitdate
           and exists (
                  select
                         *
                  from
                         lineitem l2
                  where
                         l2.l_orderkey = l1.l_orderkey
                         and l2.l_suppkey <> l1.l_suppkey
           )
           and not exists (
                  select
                         *
                  from
                         lineitem l3
                  where
                         l3.l_orderkey = l1.l_orderkey
                         and l3.l_suppkey <> l1.l_suppkey
                         and l3.l_receiptdate > l3.l_commitdate
           )
           and s_nationkey = n_nationkey
           and n_name = 'VIETNAM'
    group by
           s_name
    order by
           numwait desc,
           s_name
    limit 100;

    Error message:

    Error: Error while compiling statement: FAILED: SemanticException [Error 10249]: Line 23:8 Unsupported SubQuery Expression 'l_commitdate': Only 1 SubQuery expression is supported. (state=42000,code=10249)
  • The multi-level in-nested sub-query is not supported.

    For example:

    select i_item_id item_id,
             sum(sr_return_quantity) sr_item_qty
      from store_returns,
           item,
           date_dim
      where sr_item_sk = i_item_sk
      and   d_date    in
             (select d_date
             from date_dim
             where d_week_seq in
                     (select d_week_seq
                     from date_dim
               where d_date in ('1998-01-02','1998-10-15','1998-11-10')))
      and   sr_returned_date_sk   = d_date_sk
      group by i_item_id),
      cr_items as
      (select i_item_id item_id,
             sum(cr_return_quantity) cr_item_qty
      from catalog_returns,
           item,
           date_dim
      where cr_item_sk = i_item_sk);

    Error message:

    Unsupported SubQuery Expression 'd_week_seq': SubQuery cannot use the table alias: date_dim; this is also an alias in the Outer Query and SubQuery contains a unqualified column reference (state=42000,code=10249)