Finding all the joins necessary to programmatically join a table

Given a SourceTable and a TargetTable, I would like to programmatically create a string with all the joins required.

In short, I am trying to find a way to create a string like this:

FROM SourceTable t
JOIN IntermediateTable t1 on t1.keycolumn = t.keycolumn
JOIN TargetTable t2 on t2.keycolumn = t1.keycolumn

I have a query that returns all the foreign keys for a given table, but I am running into limitations when trying to run it recursively to find the optimal join path and build the string.

SELECT 
    p.name AS ParentTable
    ,pc.name AS ParentColumn
    ,r.name AS ChildTable
    ,rc.name AS ChildColumn
FROM sys.foreign_key_columns fk
JOIN sys.columns pc ON pc.object_id = fk.parent_object_id AND pc.column_id = fk.parent_column_id 
JOIN sys.columns rc ON rc.object_id = fk.referenced_object_id AND rc.column_id = fk.referenced_column_id
JOIN sys.tables p ON p.object_id = fk.parent_object_id
JOIN sys.tables r ON r.object_id = fk.referenced_object_id
WHERE fk.parent_object_id = OBJECT_ID('aTable')
ORDER BY ChildTable, fk.referenced_column_id

I am sure this has been done before, but I can't find an example.

sql-server sql-server-2014 recursive system-tables Metaphor

What if there are 2 or more paths from the source to the target?

ypercubeᵀᴹ

Yes, I would be worried about multiple potential paths, and also about a single path that is more than two steps long. Also, keys that consist of more than one column. All of these scenarios will throw a wrench into any automated solution.

Aaron Bertrand

Note that even a single foreign key between two tables allows 2 or more paths (in fact, an unlimited number of paths of arbitrary length). Consider the query "find all items that have been placed at least once in the same order as item X". You will need to join OrderItems with Orders and then back with OrderItems.

ypercubeᵀᴹ

@ypercube Right. Also, what exactly does "the ideal path" mean?

Aaron Bertrand

"Ideal JOIN path" means "the shortest series of joins that will join the Target table to the Source table". Suppose T1 is referenced in T2 and T3, T2 is referenced in T4, and T3 is referenced in T4. The ideal path from T1 to T3 is T1, T2, T3. The path T1, T2, T4, T3 would not be ideal, since it is longer.

Metaphor

Answers:

I had a script that does a rudimentary version of foreign key traversal. I adapted it quickly (see below), and you may be able to use it as a starting point.

Given a target table, the script attempts to print the join string for the shortest path (or one of them in the case of ties) for every possible source table from which single-column foreign keys can be traversed to reach the target table. The script seems to work well on the database I tried it on, which has a couple thousand tables and many FK connections.
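The "shortest path" idea can also be tried outside the database. The following is only a minimal sketch in Python, not the approach the script takes: it uses a plain breadth-first search and, as an assumption, treats every foreign key as usable in both directions. The function name `shortest_join_path`, the edge tuple layout `(parent_table, parent_column, referenced_table, referenced_column)`, and the sample tables are all hypothetical.

```python
from collections import deque


def shortest_join_path(fk_edges, source, target):
    """Return the shortest list of FK edges linking source to target, or None."""
    # Build an undirected adjacency list (assumption: a join may follow an FK
    # from either side).
    neighbours = {}
    for edge in fk_edges:
        parent, _parent_col, referenced, _referenced_col = edge
        neighbours.setdefault(parent, []).append((referenced, edge))
        neighbours.setdefault(referenced, []).append((parent, edge))

    # Breadth-first search: the first time a table is reached, it is reached
    # by a path with the fewest joins. Ties are broken by edge order.
    previous = {source: None}
    queue = deque([source])
    while queue:
        table = queue.popleft()
        if table == target:
            break
        for next_table, edge in neighbours.get(table, []):
            if next_table not in previous:
                previous[next_table] = (table, edge)
                queue.append(next_table)

    if target not in previous:
        return None  # no chain of foreign keys connects the two tables

    # Walk back from the target to the source, collecting the edges used.
    path = []
    table = target
    while previous[table] is not None:
        table, edge = previous[table]
        path.append(edge)
    path.reverse()
    return path


# Hypothetical schema: Regions can be reached from Orders directly or via Customers.
SAMPLE_EDGES = [
    ("OrderItems", "OrderId", "Orders", "OrderId"),
    ("Orders", "CustomerId", "Customers", "CustomerId"),
    ("Customers", "RegionId", "Regions", "RegionId"),
    ("Orders", "RegionId", "Regions", "RegionId"),
]
```

With the sample edges, `shortest_join_path(SAMPLE_EDGES, "OrderItems", "Regions")` picks the two-join route through Orders rather than the three-join route through Customers. Like the script, it returns only one of the shortest paths when several tie.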

As others mention in the comments, you will need to make this more complex if you have to handle multi-column foreign keys. Also, please keep in mind that this code is by no means production-ready or fully tested. I hope it is a useful starting point if you decide to build this functionality!
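To illustrate what handling multi-column foreign keys involves, here is a small Python sketch under stated assumptions. It groups rows shaped like `(constraint_name, constraint_column_id, parent_column, referenced_column)` by constraint and emits one ON clause per foreign key, with every column pair joined by AND. The row layout, the function name `on_clauses`, and the sample constraint names are hypothetical. The real `sys.foreign_key_columns` view stores object and column IDs, not names.

```python
from collections import defaultdict


def on_clauses(fk_column_rows, parent_alias, referenced_alias):
    """Map each FK constraint name to an ON clause covering all of its columns."""
    columns_by_fk = defaultdict(list)
    for constraint, position, parent_col, referenced_col in fk_column_rows:
        columns_by_fk[constraint].append((position, parent_col, referenced_col))

    clauses = {}
    for constraint, columns in columns_by_fk.items():
        columns.sort()  # keep the column pairs in constraint_column_id order
        predicates = [
            f"{parent_alias}.{parent_col} = {referenced_alias}.{referenced_col}"
            for _position, parent_col, referenced_col in columns
        ]
        clauses[constraint] = "ON " + " AND ".join(predicates)
    return clauses


# Hypothetical metadata: one two-column FK and one single-column FK.
SAMPLE_FK_ROWS = [
    ("FK_Line_Order", 2, "OrderNo", "OrderNo"),
    ("FK_Line_Order", 1, "CompanyId", "CompanyId"),
    ("FK_Line_Product", 1, "ProductId", "Id"),
]
```

The point of grouping by constraint first is that a composite key must be joined on all of its columns at once. Treating each column row as a separate edge would produce joins that match too many rows.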

-- Drop temp tables that will be used below
IF OBJECT_ID('tempdb..#paths') IS NOT NULL
    DROP TABLE #paths
GO
IF OBJECT_ID('tempdb..#shortestPaths') IS NOT NULL
    DROP TABLE #shortestPaths
GO

-- The table (e.g. "TargetTable") to start from (or end at, depending on your point of view)
DECLARE @targetObjectName SYSNAME = 'TargetTable'

-- Identify all paths from TargetTable to any other table on the database,
-- counting all single-column foreign keys as a valid connection from one table to the next
;WITH singleColumnFkColumns AS (
    -- We limit the scope of this exercise to single column foreign keys
    -- We explicitly filter out any multi-column foreign keys to ensure that they aren't misinterpreted below
    SELECT fk1.*
    FROM sys.foreign_key_columns fk1
    LEFT JOIN sys.foreign_key_columns fk2 ON fk2.constraint_object_id = fk1.constraint_object_id AND fk2.constraint_column_id = 2
    WHERE fk1.constraint_column_id = 1
        AND fk2.constraint_object_id IS NULL
)
, parentCTE AS (
    -- Base case: Find all outgoing (pointing into another table) foreign keys for the specified table
    SELECT 
        p.object_id AS ParentId
        ,OBJECT_SCHEMA_NAME(p.object_id) + '.' + p.name AS ParentTable
        ,pc.column_id AS ParentColumnId
        ,pc.name AS ParentColumn
        ,r.object_id AS ChildId
        ,OBJECT_SCHEMA_NAME(r.object_id) + '.' + r.name AS ChildTable
        ,rc.column_id AS ChildColumnId
        ,rc.name AS ChildColumn
        ,1 AS depth
        -- Maintain the full traversal path that has been taken thus far
        -- We use "," to delimit each table, and each entry then has a
        -- "<object_id>_<parent_column_id>_<child_column_id>" format
        ,   ',' + CONVERT(VARCHAR(MAX), p.object_id) + '_NULL_' + CONVERT(VARCHAR(MAX), pc.column_id) +
            ',' + CONVERT(VARCHAR(MAX), r.object_id) + '_' + CONVERT(VARCHAR(MAX), pc.column_id) + '_' + CONVERT(VARCHAR(MAX), rc.column_id) AS TraversalPath
    FROM singleColumnFkColumns fk -- Use the filtered set so multi-column FKs are excluded here too
    JOIN sys.columns pc ON pc.object_id = fk.parent_object_id AND pc.column_id = fk.parent_column_id 
    JOIN sys.columns rc ON rc.object_id = fk.referenced_object_id AND rc.column_id = fk.referenced_column_id
    JOIN sys.tables p ON p.object_id = fk.parent_object_id
    JOIN sys.tables r ON r.object_id = fk.referenced_object_id
    WHERE fk.parent_object_id = OBJECT_ID(@targetObjectName)
        AND p.object_id <> r.object_id -- Ignore FKs from one column in the table to another

    UNION ALL

    -- Recursive case: Find all outgoing foreign keys for all tables
    -- on the current fringe of the recursion
    SELECT 
        p.object_id AS ParentId
        ,OBJECT_SCHEMA_NAME(p.object_id) + '.' + p.name AS ParentTable
        ,pc.column_id AS ParentColumnId
        ,pc.name AS ParentColumn
        ,r.object_id AS ChildId
        ,OBJECT_SCHEMA_NAME(r.object_id) + '.' + r.name AS ChildTable
        ,rc.column_id AS ChildColumnId
        ,rc.name AS ChildColumn
        ,cte.depth + 1 AS depth
        ,cte.TraversalPath + ',' + CONVERT(VARCHAR(MAX), r.object_id) + '_' + CONVERT(VARCHAR(MAX), pc.column_id) + '_' + CONVERT(VARCHAR(MAX), rc.column_id) AS TraversalPath
    FROM parentCTE cte
    JOIN singleColumnFkColumns fk
        ON fk.parent_object_id = cte.ChildId
        -- Optionally consider only a traversal of the same foreign key
        -- With this commented out, we can reach table A via column A1
        -- and leave table A via column A2.  If uncommented, we can only
        -- enter and leave a table via the same column
        --AND fk.parent_column_id = cte.ChildColumnId
    JOIN sys.columns pc ON pc.object_id = fk.parent_object_id AND pc.column_id = fk.parent_column_id 
    JOIN sys.columns rc ON rc.object_id = fk.referenced_object_id AND rc.column_id = fk.referenced_column_id
    JOIN sys.tables p ON p.object_id = fk.parent_object_id
    JOIN sys.tables r ON r.object_id = fk.referenced_object_id
    WHERE p.object_id <> r.object_id -- Ignore FKs from one column in the table to another
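        -- If our path has already taken us to this table, avoid the cycle that would be created by returning to the same table
        -- Each entry is ",<object_id>_..."; "[_]" matches a literal underscore (a bare "_" is a LIKE wildcard)
        AND cte.TraversalPath NOT LIKE ('%,' + CONVERT(VARCHAR(MAX), r.object_id) + '[_]%')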
)
SELECT *
INTO #paths
FROM parentCTE
ORDER BY depth, ParentTable, ChildTable
GO

-- For each distinct table that can be reached by traversing foreign keys,
-- record the shortest path to that table (or one of the shortest paths in
-- case there are multiple paths of the same length)
SELECT *
INTO #shortestPaths
FROM (
    SELECT *, ROW_NUMBER() OVER (PARTITION BY ChildTable ORDER BY depth ASC) AS rankToThisChild
    FROM #paths
) x
WHERE rankToThisChild = 1
ORDER BY ChildTable
GO

-- Traverse the shortest path, starting from the end of the full path and working backwards,
-- building up the desired join string as we go
WITH joinCTE AS (
    -- Base case: Start with the from clause to the child table at the end of the traversal
    -- Note that the first step of the recursion will re-process this same row, but adding
    -- the ParentTable => ChildTable join
    SELECT p.ChildTable
        , p.TraversalPath AS ParentTraversalPath
        , NULL AS depth
        , CONVERT(VARCHAR(MAX), 'FROM ' + p.ChildTable + ' t' + CONVERT(VARCHAR(MAX), p.depth+1)) AS JoinString
    FROM #shortestPaths p

    UNION ALL

    -- Recursive case: Process the ParentTable => ChildTable join, then recurse to the
    -- previous table in the full traversal.  We'll end once we reach the root and the
    -- "ParentTraversalPath" is the empty string
    SELECT cte.ChildTable
        , REPLACE(p.TraversalPath, ',' + CONVERT(VARCHAR, p.ChildId) + '_' + CONVERT(VARCHAR, p.ParentColumnId)+ '_' + CONVERT(VARCHAR, p.ChildColumnId), '') AS TraversalPath
        , p.depth
        , cte.JoinString + '
' + CONVERT(VARCHAR(MAX), 'JOIN ' + p.ParentTable + ' t' + CONVERT(VARCHAR(MAX), p.depth) + ' ON t' + CONVERT(VARCHAR(MAX), p.depth) + '.' + p.ParentColumn + ' = t' + CONVERT(VARCHAR(MAX), p.depth+1) + '.' + p.ChildColumn) AS JoinString
    FROM joinCTE cte
    JOIN #paths p
        ON p.TraversalPath = cte.ParentTraversalPath
)
-- Select only the fully built strings that end at the root of the traversal
-- (which should always be the specific table name, e.g. "TargetTable")
SELECT ChildTable, 'SELECT TOP 100 * 
' +JoinString
FROM joinCTE
WHERE depth = 1
ORDER BY ChildTable
GO
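The last step of the script, turning a path into FROM/JOIN text, can also be sketched on its own. The Python below is a hedged illustration, not a port of the T-SQL above: it assumes a path is an ordered list of hypothetical `(parent_table, parent_column, referenced_table, referenced_column)` tuples and uses the `t`, `t1`, `t2` alias scheme from the question. The function name `render_joins` is made up for this example.

```python
def render_joins(source, path):
    """Render a FROM clause plus one JOIN line per foreign-key step in path."""
    lines = [f"FROM {source} t"]
    current_table, current_alias = source, "t"
    for step, (parent, parent_col, referenced, referenced_col) in enumerate(path, start=1):
        alias = f"t{step}"
        # Work out which end of the foreign key is the table being added.
        if parent == current_table:
            new_table, new_col, old_col = referenced, referenced_col, parent_col
        elif referenced == current_table:
            new_table, new_col, old_col = parent, parent_col, referenced_col
        else:
            raise ValueError(
                f"step {step} ({parent} -> {referenced}) does not continue from {current_table}"
            )
        lines.append(
            f"JOIN {new_table} {alias} ON {alias}.{new_col} = {current_alias}.{old_col}"
        )
        current_table, current_alias = new_table, alias
    return "\n".join(lines)


# Hypothetical path matching the shape of the example in the question.
SAMPLE_PATH = [
    ("SourceTable", "keycolumn", "IntermediateTable", "keycolumn"),
    ("TargetTable", "keycolumn", "IntermediateTable", "keycolumn"),
]
```

Calling `render_joins("SourceTable", SAMPLE_PATH)` yields a three-line string in the same shape as the one the question asks for. Table and column names are interpolated as-is, so real use would need quoting (for example with QUOTENAME on the SQL Server side).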

Geoff Patterson

You can put the list of keys for each table you want to connect into a table with two fields, Table_name and Key_name.

For example, for the table City:

City | City_name
City | Country_name
City | Province_name
City | City_Code

Do the same for Province and Country.

Collect the data for the tables and put it in a single table (e.g. a metadata table).

Now draft the query as below:

select * from
(Select Table_name,Key_name from Meta_Data 
where Table_name in ('City','Province','Country')) A,
(Select Table_name,Key_name from Meta_Data 
where Table_name in ('City','Province','Country')) B,
(Select Table_name,Key_name from Meta_Data 
where Table_name in ('City','Province','Country')) C

where

A.Table_Name <> B.Table_name and
B.Table_name <> C.Table_name and
C.Table_name <> A.Table_name and
A.Key_name = B.Key_name and
B.Key_name = C.Key_name

This shows how you can link the tables based on matching keys (same key names).

If you think the key names might not match, include an alternate key field and try using it in the where condition.
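The name-matching idea can be sketched in a few lines of Python, assuming the metadata is available as `(table_name, key_name)` pairs. The function name `shared_keys` is hypothetical, and so are the Province and Country rows, which the example above does not list. Only exact name matches are found, so this has the same limitation as the query.

```python
from collections import defaultdict
from itertools import combinations


def shared_keys(meta_rows):
    """Return {(table_a, table_b): [key names present in both tables]}."""
    tables_by_key = defaultdict(set)
    for table_name, key_name in meta_rows:
        tables_by_key[key_name].add(table_name)

    links = defaultdict(list)
    for key_name in sorted(tables_by_key):
        # Every pair of tables that carries this key name is a candidate join.
        for table_a, table_b in combinations(sorted(tables_by_key[key_name]), 2):
            links[(table_a, table_b)].append(key_name)
    return dict(links)


# City rows follow the example above; Province and Country rows are assumed.
SAMPLE_META = [
    ("City", "City_name"),
    ("City", "Country_name"),
    ("City", "Province_name"),
    ("City", "City_Code"),
    ("Province", "Province_name"),
    ("Province", "Country_name"),
    ("Country", "Country_name"),
]
```

Unlike the three-way cross join, this reports candidate links pair by pair, so it does not require one key name to be common to all three tables.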

i44

Note that the requester wanted to use the existing sys tables in SQL Server that describe the columns in a table, how tables are linked, and so on. All of that already exists. Building your own tables that define your table structure to meet a specific need may be a fallback position, but the preferred answer would use what already exists, as the accepted answer does.
